Abstract

Rollout algorithms have demonstrated excellent performance on a variety of dynamic and discrete 
optimization problems. While in many cases rollout algorithms are guaranteed to perform as well as their 
base policies, there have been few theoretical results showing additional improvement in performance. In 
this paper we perform a probabilistic analysis of the subset sum problem and knapsack problem, giving 
theoretical evidence that rollout algorithms perform strictly better than their base policies. Using a 
stochastic model from the existing literature, we analyze two rollout methods that we refer to as the 
consecutive rollout and exhaustive rollout, both of which employ a simple greedy base policy. For the 
subset sum problem, we prove that after only a single iteration of the rollout algorithm, both methods 
yield at least a 30% reduction in the expected gap between the solution value and capacity, relative to 
the base policy. Analogous results are shown for the knapsack problem. 
1 Introduction



Rollout algorithms provide a natural and easily implemented approach for approximately solving many 
discrete and dynamic optimization problems. Their motivation comes from problems that may be solved 
using classical dynamic programming, but for which determining the value function (or value-to-go function) 
is computationally infeasible. The rollout technique estimates these values by simulating future events while 
following a simple greedy/heuristic policy, referred to as the base policy. In most cases the rollout algorithm
is guaranteed to perform at least as well as its base policy [2]. As shown by many computational studies, the
performance is often much better than that of the base policy, and sometimes near optimal [4].

Theoretical results showing a strict improvement of rollout algorithms over base policies have been limited 
to average-case asymptotic bounds on the breakthrough problem and a worst-case analysis of the knapsack 
problem [3, 1]. The latter work motivates a complementary study of rollout algorithms for knapsack-type
problems from an average-case perspective, which we provide in this paper. Our goals are to give theoretical 
evidence for the utility of rollout algorithms and to contribute to the knowledge of problem types and features 
that make rollout algorithms work well. We anticipate that our proof techniques may be helpful in achieving 
performance guarantees on similar problems. 

The knapsack problem is given by a set of $n$ items with weights $w_i$, $i = 1, \ldots, n$, profits $p_i$, $i = 1, \ldots, n$,
and a knapsack with capacity $b$. The goal is to select a subset of items with maximum net profit while
ensuring that the net weight does not exceed the capacity. The subset sum problem refers to instances 
where each item has a profit equal to its weight. Using an average-case perspective, it is assumed that the 
item weights, item profits, and capacity are drawn randomly from underlying distributions. The algorithms 
being analyzed do not have or use any knowledge of these distributions. While rollout algorithms are often 
used for online dynamic problems, we focus strictly on offline problems where all information is given upfront 
(as opposed to being revealed over time). 

We use a stochastic model directly from the literature that has been used to study a wide variety of greedy 
algorithms for the subset sum problem [6]. In our analysis, this model is extended in a natural manner for 
the knapsack problem. We analyze two rollout techniques which we refer to as the consecutive rollout and 
the exhaustive rollout, both of which use the same base policy. The first algorithm sequentially processes 
the items and at each iteration decides if the current item should be added to the knapsack. During each 
iteration of the exhaustive rollout, the algorithm decides which one of the available items should be added 
to the knapsack. The base policy is a simple greedy algorithm that adds items until an infeasible item is 
encountered. 

For both techniques, we derive bounds showing that the expected performance of the rollout algorithms 
is strictly better than the performance obtained by only using the base policy. For the subset sum problem, 
this is demonstrated by measuring the gap between the total value of packed items and capacity. For the 
knapsack problem, the difference between net profits of the rollout algorithm and base policy is measured. 
The bounds are given as a general function of the number of items n, yielding useful bounds for small 
instances and also describing asymptotic performance. 

The organization of the paper is as follows. In the remainder of this section, we describe the stochastic 
models and algorithms in detail, state our results, and review related work. In Section 2 we describe
important properties of the blind greedy algorithm, which is the algorithm we use for a base policy. Proofs
for the consecutive rollout and the exhaustive rollout are given in Section 3 and Section 4, respectively.
Section 5 gives concluding remarks, and evaluations of some integrals used in the proofs are provided in the
Appendix (Section A).

1.1 Model and results 

In the knapsack problem, we are given a set $I$ of $n$ items where each item $i \in I$ has a weight $w_i \in \mathbb{R}_+$ and
profit $p_i \in \mathbb{R}_+$. Given a capacity $b \in \mathbb{R}_+$, the goal is to select a subset of items with maximum total profit
and total weight that does not exceed the capacity. This is given by the following integer linear program.

$$
\begin{aligned}
\max \quad & \sum_{i=1}^{n} p_i x_i \\
\text{s.t.} \quad & \sum_{i=1}^{n} w_i x_i \le b, \\
& x_i \in \{0, 1\}, \quad i = 1, \ldots, n.
\end{aligned}
\tag{1}
$$

For the subset sum problem, we simply have $p_i = w_i$ for all $i \in I$.

We use the stochastic subset sum model given by Borgwardt and Tremel [6], and a variation of this model
for the knapsack problem. In their subset sum model, for a specified number of items $n$, item weights $W_i$
and the capacity $B$ are drawn independently from the following distributions:

$$
\begin{aligned}
W_i &\sim \mathcal{U}[0, 1], \quad i = 1, \ldots, n, \\
B &\sim \mathcal{U}[0, n],
\end{aligned}
\tag{2}
$$

where the notation $\mathcal{U}[x, y]$ indicates the uniform probability distribution on the interval $[x, y]$. Our
stochastic knapsack model simply assigns item profits that are independently and uniformly distributed,

$$
P_i \sim \mathcal{U}[0, 1], \quad i = 1, \ldots, n. \tag{3}
$$

These values are also independent of the weights and capacity.

For evaluating performance, we only consider cases where $\sum_{i \in I} W_i > B$. In all other cases, any algorithm
that tries adding all items is optimal. Since it is difficult to understand the stochastic nature of optimal
solutions, we use $\mathbb{E}[B - \sum_{i \in S} W_i \mid \sum_{i \in I} W_i > B]$ as a performance metric for the subset sum problem, where
$S$ is the set of items selected by the algorithm of interest. This is consistent with [6], where they note with
a simple symmetry argument that for all values of $n$,

$$
\mathbb{P}\Big( \sum_{i \in I} W_i > B \Big) = \frac{1}{2}. \tag{4}
$$



Algorithm 1 Blind Greedy

Input: Item set $I$, capacity $b$.

Output: Feasible solution set $S$, value $U$.

1: Initialize solution set $S \leftarrow \emptyset$, remaining capacity $\bar{b} \leftarrow b$, and value $U \leftarrow 0$.
2: for $i = 1$ to $n$ (each item) do
3:   if $w_i \le \bar{b}$ (item weight does not exceed remaining capacity) then
4:     Add item $i$ to solution set, $S \leftarrow S \cup \{i\}$.
5:     Update remaining capacity $\bar{b} \leftarrow \bar{b} - w_i$, and value $U \leftarrow U + p_i$.
6:   else
7:     Stop and return $S$, $U$.
8:   end if
9: end for
10: Return $S$, $U$.
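Algorithm 1 can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation; the function name `blind_greedy` and the use of 0-based item indices are assumptions made here.

```python
from typing import List, Sequence, Tuple


def blind_greedy(weights: Sequence[float], profits: Sequence[float],
                 capacity: float) -> Tuple[List[int], float]:
    """Add items in the given order until one does not fit, then stop.

    Returns the selected item indices (0-based) and their total profit.
    For the subset sum problem, pass the weights as the profits.
    """
    selected: List[int] = []
    remaining = capacity
    value = 0.0
    for i, (w, p) in enumerate(zip(weights, profits)):
        if w > remaining:
            # First infeasible item (the critical item): stop immediately.
            break
        selected.append(i)
        remaining -= w
        value += p
    return selected, value


if __name__ == "__main__":
    w = [0.4, 0.3, 0.5, 0.1]
    # Item 2 (weight 0.5) is critical, so item 3 is never examined.
    print(blind_greedy(w, w, 1.0))
```

Note that the last item in the example would fit, but the blind greedy policy never reaches it; this is the slack that the rollout algorithms exploit.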



For the knapsack problem, we directly measure the difference between the rollout algorithm profit and the 
profit of the blind greedy algorithm - this is referred to as the gain of the rollout algorithm. 

For both the subset sum problem and the knapsack problem, we use the blind greedy algorithm, shown
in Algorithm 1, as a base policy. The algorithm simply adds items (without sorting) until it encounters an
item that exceeds the remaining capacity, then stops. With all algorithm descriptions in this paper, we show
the algorithm for the knapsack problem, and the description holds for the subset sum problem with $p_i = w_i$.
We refer to the first item that is infeasible as the critical item. Let $K$ be the random variable for the index of
the critical item, where $K = 0$ indicates that there is no critical item (meaning $\sum_{i \in I} W_i \le B$). Equivalently,
assuming $\sum_{i \in I} W_i > B$, the critical item index satisfies $\sum_{i=1}^{K-1} W_i \le B < \sum_{i=1}^{K} W_i$. For $K > 0$, the gap $G$
of the blind greedy algorithm is given by

$$
G = B - \sum_{i=1}^{K-1} W_i. \tag{5}
$$



The gap is relevant to both the subset sum problem and the knapsack problem. For the knapsack problem,
we define the gain of the rollout algorithm as

$$
Z = \sum_{i \in S_R} P_i - \sum_{i=1}^{K-1} P_i, \tag{6}
$$

where $S_R$ is the set of items selected by the rollout algorithm. A central result of [6] is the following.

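Theorem 1 (Borgwardt and Tremel, 1991). Independent of the critical item $K > 0$, the probability
distribution of the gap obtained by the blind greedy algorithm satisfies

$$
\mathbb{P}\Big( G \le g \,\Big|\, \sum_{i \in I} W_i > B \Big) = 2g - g^2, \quad 0 \le g \le 1, \tag{7}
$$

$$
\mathbb{E}\Big[ G \,\Big|\, \sum_{i \in I} W_i > B \Big] = \frac{1}{3}. \tag{8}
$$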



This holds for both our knapsack problem and the subset sum problem due to the nature of the algorithm. 
We show an alternative proof of this result in the following section. Note that these expressions do not 
depend on the number of items n. 
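The stochastic model and Theorem 1 are easy to check numerically. The following is a minimal Monte Carlo sketch, assuming the model in (2); the helper names (`sample_instance`, `blind_greedy_gap`, `estimate_gap`) and the seeds are illustrative choices, not part of the original analysis.

```python
import random
from typing import List, Optional, Tuple


def sample_instance(n: int, rng: random.Random) -> Tuple[List[float], float]:
    """Draw W_i ~ U[0,1] for i = 1..n and B ~ U[0,n], independently."""
    weights = [rng.random() for _ in range(n)]
    capacity = rng.uniform(0.0, n)
    return weights, capacity


def blind_greedy_gap(weights: List[float], capacity: float) -> Optional[float]:
    """Return the gap G left by blind greedy, or None if every item fits."""
    remaining = capacity
    for w in weights:
        if w > remaining:
            return remaining  # the first infeasible item is the critical item
        remaining -= w
    return None  # no critical item (K = 0)


def estimate_gap(n: int, trials: int, seed: int = 0) -> Tuple[float, float]:
    """Estimate P(sum W_i > B) and E[G | sum W_i > B] by simulation."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        weights, capacity = sample_instance(n, rng)
        gap = blind_greedy_gap(weights, capacity)
        if gap is not None:
            gaps.append(gap)
    return len(gaps) / trials, sum(gaps) / len(gaps)


if __name__ == "__main__":
    for n in (2, 5, 20):
        prob, mean_gap = estimate_gap(n, 100_000, seed=n)
        print(n, round(prob, 3), round(mean_gap, 3))
```

Under the model, the first estimate should be close to 1/2 and the second close to 1/3 for every $n$, up to sampling noise.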



Algorithm 2 Consecutive rollout
Input: Ordered set $I$, capacity $b$.
Output: Feasible solution set $S$, value $U$.

1: Initialize solution set $S \leftarrow \emptyset$, remaining item set $\bar{I} \leftarrow I$, remaining capacity $\bar{b} \leftarrow b$, and value $U \leftarrow 0$.
2: for $i = 1$ to $n$ (each item) do
3:   Estimate the value of adding item $i$, $(S_+, U_+) = \text{BlindGreedy}(\bar{I}, \bar{b})$.
4:   Estimate the value of not adding item $i$, $(S_-, U_-) = \text{BlindGreedy}(\bar{I} \setminus \{i\}, \bar{b})$.
5:   if $U_+ > U_-$ (estimated value of adding the item is larger) then
6:     Add item $i$ to solution set, $S \leftarrow S \cup \{i\}$.
7:     Update remaining capacity, $\bar{b} \leftarrow \bar{b} - w_i$, and value, $U \leftarrow U + p_i$.
8:   end if
9:   Remove item $i$ from the remaining item set, $\bar{I} \leftarrow \bar{I} \setminus \{i\}$.
10: end for
11: Return $S$, $U$.
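A direct Python transcription of Algorithm 2 may help make the two blind greedy calls concrete. It is a sketch under the assumption that items are identified by 0-based indices; the helper `blind_greedy_value` is a name introduced here for illustration.

```python
from typing import List, Sequence, Tuple


def blind_greedy_value(order: Sequence[int], weights: Sequence[float],
                       profits: Sequence[float], capacity: float) -> float:
    """Profit collected by blind greedy when items are visited in `order`."""
    remaining, value = capacity, 0.0
    for i in order:
        if weights[i] > remaining:
            break  # stop at the first item that does not fit
        remaining -= weights[i]
        value += profits[i]
    return value


def consecutive_rollout(weights: Sequence[float], profits: Sequence[float],
                        capacity: float) -> Tuple[List[int], float]:
    """Process items in order; keep item i only if the rollout estimate improves."""
    remaining_items = list(range(len(weights)))
    remaining = capacity
    solution: List[int] = []
    value = 0.0
    for i in range(len(weights)):
        # U_plus: blind greedy on the remaining sequence, which starts with item i.
        u_plus = blind_greedy_value(remaining_items, weights, profits, remaining)
        # U_minus: blind greedy on the remaining sequence with item i skipped.
        u_minus = blind_greedy_value(remaining_items[1:], weights, profits, remaining)
        if u_plus > u_minus:
            solution.append(i)
            remaining -= weights[i]
            value += profits[i]
        remaining_items = remaining_items[1:]
    return solution, value


if __name__ == "__main__":
    w = [0.6, 0.5, 0.5]
    print(consecutive_rollout(w, w, 1.0))
```

In the example, blind greedy alone packs only the first item (value 0.6), whereas the rollout skips it and fills the knapsack with the remaining two items. With nonnegative profits, $U_+ > U_-$ implies that item $i$ itself fits, so the returned solution is always feasible.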



The consecutive rollout algorithm is shown in Algorithm 2. The algorithm takes as input an ordered set
of items $I$ and capacity $b$, and makes calls to the blind greedy algorithm as a subroutine. The ordered set does
not imply that items are sorted in any fashion up front; it is simply needed so that the algorithm keeps track
of which items it has processed. At iteration $i$, the algorithm calculates the value ($U_+$) of adding item $i$ to
the solution and using the blind greedy algorithm on the remaining set, and the value ($U_-$) of not adding
the item to the solution and using the blind greedy algorithm thereafter. The item is then added to the
solution only if the former valuation ($U_+$) is larger.

Our analysis focuses only on the result of the first iteration of the algorithm; bounds of interest (upper
bound for the subset sum gap, lower bound for the knapsack gain) from the first iteration are valid for future
iterations.¹ A single iteration of the consecutive rollout effectively takes the best of two solutions, the one
obtained by the greedy algorithm and the solution obtained from using the greedy algorithm after removing
the first item. Let $V_*$ denote the gap obtained by a single iteration of the rollout algorithm.



Theorem 2. For the subset sum problem with $n \ge 3$, the gap $V_*$ obtained by running a single iteration of
the consecutive rollout algorithm satisfies

$$
\frac{4 + 5n}{30n} \le \mathbb{E}\Big[ V_* \,\Big|\, \sum_{i \in I} W_i > B \Big] \le \frac{3 + 13n}{60n} \le \frac{7}{30} \approx 0.233. \tag{9}
$$




Figure 1: Performance bounds and simulated values of the expected gap $V_*$ after running a single iteration
of the consecutive rollout algorithm on the subset sum problem. The bounds correspond to Theorem 2. For
each $n$, the mean gap is shown for $10^5$ simulations. The expected gap of the blind greedy algorithm is $\frac{1}{3}$ for
all $n$.



As expected, there is not a strong dependence on $n$ for this algorithm. The upper bound is tight for
$n = 3$, where it evaluates to $\frac{7}{30} \approx 0.233$. It is also clear that $\lim_{n \to \infty} \mathbb{E}[V_* \mid \cdot] \le \frac{13}{60} \approx 0.217$. The bounds are
shown with simulated performance in Figure 1. The upper bound is close to the simulated
performance, and significantly lower than the expected gap of the greedy algorithm. A similar result holds
for the knapsack problem.
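The subset sum upper bound $(3 + 13n)/(60n)$ of Theorem 2 can be compared against simulation with a short script. This is a hedged sketch: the function names and trial counts are illustrative, and the first rollout iteration is modeled as the minimum of the greedy gap and the gap obtained after dropping the first item, as described above.

```python
import random
from fractions import Fraction
from typing import Sequence


def upper_bound(n: int) -> Fraction:
    """Upper bound (3 + 13n) / (60n) on the expected gap, stated for n >= 3."""
    return Fraction(3 + 13 * n, 60 * n)


def greedy_gap(weights: Sequence[float], capacity: float) -> float:
    """Capacity left over after running blind greedy on the ordered weights."""
    remaining = capacity
    for w in weights:
        if w > remaining:
            break
        remaining -= w
    return remaining


def simulate_min_gap(n: int, trials: int, seed: int = 0) -> float:
    """Estimate E[min(G, V) | sum W_i > B] for one consecutive rollout iteration."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(trials):
        weights = [rng.random() for _ in range(n)]
        capacity = rng.uniform(0.0, n)
        if sum(weights) <= capacity:
            continue  # all items fit; excluded by the conditioning
        g = greedy_gap(weights, capacity)      # keep the first item
        v = greedy_gap(weights[1:], capacity)  # drop the first item
        total += min(g, v)
        count += 1
    return total / count


if __name__ == "__main__":
    for n in (3, 5, 10, 50):
        print(n, float(upper_bound(n)), round(simulate_min_gap(n, 100_000, seed=n), 4))
```

For $n = 3$ the simulated mean should sit on the bound $7/30$, while for larger $n$ it should fall slightly below the bound.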



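Theorem 3. For the knapsack problem with $n \ge 3$, the gain $Z_*$ obtained by running a single iteration of
the consecutive rollout algorithm satisfies

$$
\mathbb{E}\Big[ Z_* \,\Big|\, \sum_{i \in I} W_i > B \Big] \ge \frac{-26 + 59n}{288n} \ge \frac{151}{864} \approx 0.175. \tag{10}
$$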



The bound is plotted with simulated values in Figure 2. Again the bound is tight for $n = 3$ with a gain of $\frac{151}{864} \approx 0.175$.

1 The technical condition for this property to hold is that the base policy/algorithm is sequentially consistent, as defined in 
[2]. It is easy to verify that the blind greedy algorithm satisfies this property. 
Algorithm 3 Exhaustive rollout

Input: Ordered set $I$, capacity $b$.
Output: Feasible solution set $S$, value $U$.

1: Initialize solution set $S \leftarrow \emptyset$, remaining item set $\bar{I} \leftarrow I$, remaining capacity $\bar{b} \leftarrow b$, and value $U \leftarrow 0$.
2: for $t = 1$ to $n$ do
3:   for each item in the remaining item set, $i \in \bar{I}$ do
4:     Estimate the value of inserting item $i$ before the remaining items, $(S_i, U_i) = \text{BlindGreedy}((\{i\}; \bar{I}_{-i}), \bar{b})$.
5:   end for
6:   if $\max_i U_i > 0$ then
7:     Determine the item giving maximum estimated value, $i^* \leftarrow \arg\max_i U_i$.
8:     Move item $i^*$ from the remaining item set to the solution set, $S \leftarrow S \cup \{i^*\}$, $\bar{I} \leftarrow \bar{I} \setminus \{i^*\}$.
9:     Update remaining capacity, $\bar{b} \leftarrow \bar{b} - w_{i^*}$, and value, $U \leftarrow U + p_{i^*}$.
10:   end if
11: end for
12: Return $S$, $U$.
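Algorithm 3 can be sketched in Python as follows. The code is illustrative only: items are 0-based indices, ties are broken in favor of the lowest index (the tie-breaking rule the text assumes), and the helper `blind_greedy_value` is a name introduced here.

```python
from typing import List, Sequence, Tuple


def blind_greedy_value(order: Sequence[int], weights: Sequence[float],
                       profits: Sequence[float], capacity: float) -> float:
    """Profit collected by blind greedy when items are visited in `order`."""
    remaining, value = capacity, 0.0
    for i in order:
        if weights[i] > remaining:
            break
        remaining -= weights[i]
        value += profits[i]
    return value


def exhaustive_rollout(weights: Sequence[float], profits: Sequence[float],
                       capacity: float) -> Tuple[List[int], float]:
    """At each step, commit to the item whose move-to-front greedy value is best."""
    remaining_items = list(range(len(weights)))
    remaining = capacity
    solution: List[int] = []
    value = 0.0
    for _ in range(len(weights)):
        best_item, best_value = None, 0.0
        for i in remaining_items:
            # Ordered sequence with item i moved to the front.
            order = [i] + [j for j in remaining_items if j != i]
            u_i = blind_greedy_value(order, weights, profits, remaining)
            if u_i > best_value:  # strict comparison keeps the lowest index on ties
                best_item, best_value = i, u_i
        if best_item is None:
            break  # no estimate is positive, so later iterations cannot add items
        solution.append(best_item)
        remaining_items.remove(best_item)
        remaining -= weights[best_item]
        value += profits[best_item]
    return solution, value


if __name__ == "__main__":
    w = [0.5, 0.75, 0.25]
    print(exhaustive_rollout(w, w, 1.0))
```

Each iteration makes one blind greedy call per remaining item, so the full procedure uses $O(n^2)$ calls of $O(n)$ work each.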




Figure 2: Performance bounds and simulated values of the expected gain $Z_*$ after running a single iteration
of the consecutive rollout algorithm on the knapsack problem. The bound corresponds to Theorem 3. For
each $n$, the mean gain is shown for $10^5$ simulations.



The exhaustive rollout algorithm is shown in Algorithm 3. It takes as input an ordered set of items $I$ and
capacity $b$. Again, the ordered set does not imply that the items have been sorted. At each iteration, indexed
by $t$, the algorithm considers all items in the available set $\bar{I}$. It calculates the value obtained by moving each
item to the front of the sequence and applying the greedy algorithm. This is shown with the given notation:
$\bar{I}_{-i}$ simply indicates the ordered set $\bar{I}$ with item $i$ removed, and $(\{i\}; \bar{I}_{-i})$ indicates the concatenation of
item $i$ with the ordered set $\bar{I}_{-i}$. The algorithm then adds the item with the highest estimated value (if it
exists) to the solution. We implicitly assume a consistent tie-breaking method, such as giving preference to
the item with the lowest index. The next iteration then proceeds with the remaining set of items.







Figure 3: Performance bounds and simulated values of the expected gap $V_*$ after running a single iteration
of the exhaustive rollout algorithm on the subset sum problem. The bounds correspond to Theorem 4. For
each $n$, the mean gap is shown for $10^4$ simulations. The expected gap of the blind greedy algorithm is $\frac{1}{3}$ for
all $n$.



We again only consider the first iteration, which effectively tries using the greedy algorithm after moving 
each item to the front of the ordered set, and takes the best of these solutions. This gives an upper bound for 
the subset sum gap and a lower bound on the knapsack problem gain following from additional iterations. 
Letting V* denote the gap obtained after a single iteration, we have the following bounds for the subset sum 
problem. 

Theorem 4. For the subset sum problem, the gap $V_*$ after running a single iteration of the exhaustive
rollout algorithm satisfies

$$
\frac{1}{n(2+n)} + \frac{1}{n} \sum_{m=0}^{n-2} \frac{6 + m}{3(3+m)(4+m)}
\le \mathbb{E}\Big[ V_* \,\Big|\, \sum_{i \in I} W_i > B \Big]
\le \frac{1}{n(2+n)} + \frac{1}{n} \sum_{m=0}^{n-2} \frac{9 + 2m}{3(3+m)(4+m)}. \tag{11}
$$
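The sums in Theorem 4 are simple to evaluate exactly. The sketch below assumes the bounds have the form $\frac{1}{n(2+n)} + \frac{1}{n}\sum_{m=0}^{n-2} \frac{6+m}{3(3+m)(4+m)}$ (lower) and the same expression with numerator $9 + 2m$ (upper); the function name is an illustrative choice.

```python
from fractions import Fraction
from typing import Tuple


def exhaustive_gap_bounds(n: int) -> Tuple[Fraction, Fraction]:
    """Exact (lower, upper) bounds on the expected gap for n items."""
    base = Fraction(1, n * (n + 2))
    lower_sum = sum(
        (Fraction(6 + m, 3 * (3 + m) * (4 + m)) for m in range(n - 1)),
        Fraction(0),
    )
    upper_sum = sum(
        (Fraction(9 + 2 * m, 3 * (3 + m) * (4 + m)) for m in range(n - 1)),
        Fraction(0),
    )
    return base + lower_sum / n, base + upper_sum / n


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100):
        lo, hi = exhaustive_gap_bounds(n)
        print(n, float(lo), float(hi))
```

For $n = 1$ both sums are empty and the bounds coincide at $1/3$, the expected gap of the blind greedy algorithm, as they should since a single item leaves the rollout nothing to reorder.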



Slightly looser bounds are obtained that do not involve sums. 

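Corollary 1. For the subset sum problem, the gap $V_*$ after running a single iteration of the exhaustive
rollout algorithm satisfies

$$
\mathbb{E}\Big[ V_* \,\Big|\, \sum_{i \in I} W_i > B \Big] \le \frac{1}{n(2+n)} + \frac{1}{n} \log\left[ \frac{3 + 2n}{5} \left( \frac{4}{3 + n} \right)^{1/3} \right], \tag{12}
$$

$$
\mathbb{E}\Big[ V_* \,\Big|\, \sum_{i \in I} W_i > B \Big] \ge \frac{1}{n(2+n)} + \frac{1}{n} \log\left[ \frac{2 + n}{3} \left( \frac{7}{5 + 2n} \right)^{2/3} \right]. \tag{13}
$$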
The asymptotic behavior of the expected value follows simply from the above corollary.

Theorem 5. For the subset sum problem, the gap $V_*$ after running a single iteration of the exhaustive
rollout algorithm satisfies

$$
\lim_{n \to \infty} \mathbb{E}\Big[ V_* \,\Big|\, \sum_{i \in I} W_i > B \Big] = 0, \tag{14}
$$

$$
\mathbb{E}\Big[ V_* \,\Big|\, \sum_{i \in I} W_i > B \Big] = \Theta\left( \frac{\log n}{n} \right). \tag{15}
$$



A plot of the bounds and simulated results is shown in Figure 3. Again, the upper bound is very close to
the simulated values and much lower than the expected gap of the greedy algorithm. For the
knapsack problem, the lower bound on the gain of the rollout algorithm is given as follows.

Theorem 6. For the knapsack problem, the gain $Z_*$ after running a single iteration of the exhaustive rollout
algorithm satisfies

$$
\begin{aligned}
\mathbb{E}\Big[ Z_* \,\Big|\, \sum_{i \in I} W_i > B \Big] \ge{} & 1 + \frac{2}{n(n+1)} - \frac{2H(n)}{n^2} \\
& + \frac{1}{n} \sum_{m=0}^{n-2} \Bigg( \frac{1}{(m+1)(m+2)^3(m+3)^2} \Big[ (186 + 472m + 448m^2 + 203m^3 + 45m^4 + 4m^5) \\
& \qquad - (244 + 454m + 334m^2 + 124m^3 + 24m^4 + 2m^5) H(m+1) \\
& \qquad - (48 + 88m + 60m^2 + 18m^3 + 2m^4) [H(m+2)]^2 \Big] \\
& \quad + \sum_{j=1}^{m+1} \frac{2 \big( -4 + j - 4m + jm - m^2 - (j + (2+m)^2) H(j) + (j + (2+m)^2) H(3+m) \big)}{j(-3 + j - m)(-2 + j - m)(1 + m)(2 + m)} \Bigg).
\end{aligned}
\tag{16}
$$



In the above expression, $H(n)$ refers to the $n$th harmonic number,

$$
H(n) = \sum_{j=1}^{n} \frac{1}{j}. \tag{17}
$$
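Harmonic numbers appear throughout the knapsack bounds, so an exact evaluator is convenient when tabulating them. This is a small sketch; the function name `harmonic` is an assumption made here, and the comparison with $\log n + \gamma + \frac{1}{2n}$ is the standard asymptotic expansion rather than something taken from the paper.

```python
import math
from fractions import Fraction

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def harmonic(n: int) -> Fraction:
    """Exact n-th harmonic number H(n) = 1 + 1/2 + ... + 1/n, with H(0) = 0."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 1000):
        approx = math.log(n) + EULER_GAMMA + 1.0 / (2 * n)
        print(n, float(harmonic(n)), approx)
```

Since $H(n)$ grows like $\log n$, terms such as $2H(n)/n^2$ vanish quickly as $n$ increases.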



The gain is plotted with simulated values in Figure 4. While this bound does not admit a simple integral
bound, omitting the nested summation term gives a looser but valid bound.

Corollary 2. For the knapsack problem, the gain $Z_*$ after running a single iteration of the exhaustive rollout
algorithm satisfies

$$
\begin{aligned}
\mathbb{E}\Big[ Z_* \,\Big|\, \sum_{i \in I} W_i > B \Big] \ge{} & 1 + \frac{2}{n(n+1)} - \frac{2H(n)}{n^2} \\
& + \frac{1}{n} \sum_{m=0}^{n-2} \Bigg\{ \frac{1}{(m+1)(m+2)^3(m+3)^2} \Big[ (186 + 472m + 448m^2 + 203m^3 + 45m^4 + 4m^5) \\
& \qquad - (244 + 454m + 334m^2 + 124m^3 + 24m^4 + 2m^5) H(m+1) \\
& \qquad - (48 + 88m + 60m^2 + 18m^3 + 2m^4) [H(m+2)]^2 \Big] \Bigg\}.
\end{aligned}
\tag{18}
$$
The expected gain approaches unit value at a rate slightly slower than the convergence rate for the subset
sum problem.

Theorem 7. For the knapsack problem, the gain Z* after running a single iteration of the exhaustive rollout 
algorithm satisfies 







Figure 4: Performance bounds and simulated values of the expected gain $Z_*$ after running a single iteration of
the exhaustive rollout algorithm on the knapsack problem. The lower bound corresponds to Theorem 6 and
the relaxed lower bound is from Corollary 2. For each $n$, the mean gain is shown for $10^5$ simulations.



1.2 Related work 

Rollout algorithms were introduced by Tesauro and Galperin as online Monte-Carlo search techniques for
computer backgammon [17]. The application to combinatorial optimization was formalized by Bertsekas,
Tsitsiklis, and Wu [2]. They gave conditions under which the rollout algorithm is guaranteed to perform
as well as its base policy, namely if the algorithm is sequentially consistent or sequentially improving, and
presented computational results on a two-stage maintenance and repair problem. The application of rollout
algorithms to approximate stochastic dynamic programs was provided by Bertsekas and Castañón, where
they showed extensive computational results on variations of the quiz problem [3]. Rollout algorithms have
since shown strong computational results on a variety of problems including vehicle routing, multidimensional 
knapsack problems, fault detection, and sensor scheduling [TU [5J [TBI E] ■ 

Beyond simple bounds derived from base policies, the only theoretical results given explicitly for rollout
algorithms are average-case results for the breakthrough problem, and worst-case results for the 0-1 knapsack
problem [3, 1]. For the breakthrough problem, the objective is to find a valid path through a directed binary
tree where some edges are blocked. If the free (non-blocked) edges occur with probability $p$, independent of
other edges, a rollout algorithm has an $O(N)$ larger probability of finding a free path in comparison to the
greedy algorithm [3]. Performance bounds for the 0-1 knapsack problem were recently shown by Bertazzi
[1]. He demonstrated that from a worst-case perspective, running a single iteration of a rollout algorithm
with a conventional greedy algorithm improves the approximation guarantee from $\frac{1}{2}$ (bound provided by the
base policy) to $\frac{2}{3}$.

The work of Bertazzi suggests a close relationship between rollout algorithms and partial enumeration 
methods, which are used to obtain polynomial time approximation schemes (PTAS) for combinatorial op- 
timization problems [TDJ |T3] . Multiple step lookahead rollout algorithms, in the context of knapsack-type 
problems, try adding all combinations of items with cardinality at most $\ell$ before applying the base algorithm
on the remaining items.² This takes place during each iteration of the rollout algorithm. The same procedure
takes place with partial enumeration algorithms, but this process occurs only once and the algorithm finishes.
Thus, running a single iteration of a rollout algorithm with lookahead length $\ell$ is equivalent to running a
partial enumeration algorithm with enumeration size $\ell$. A simple example is the classical $H^{\epsilon}$ algorithm,
which gives a PTAS for the knapsack problem [TU] . 

An early study of probabilistic analysis for the subset sum problem was given by d'Atri and Puech [8]. 
Using a discrete version of the model used in this paper, they analyzed the expected performance of greedy 
algorithms with and without sorting. They showed an exact probability distribution for the gap remaining 
after the algorithms and gave asymptotic expressions for the probability of obtaining a non-zero gap. These 
results were refined by Pferschy, who gave precise bounds on expected gap values for greedy algorithms [12].

Perhaps the most extensive analysis of greedy algorithms for subset sum was given by Borgwardt and 
Tremel [6]. They introduced the continuous model that we use in this paper and derived probability dis- 
tributions of gaps for a variety of greedy algorithms. In particular, they showed performance bounds for 
a variety of prolongations of a greedy algorithm (Greedy-Split), where a different algorithm is used on the 
remaining items after the critical item is encountered. They also analyzed cases where items are ordered by 
size prior to use of the greedy algorithms. 

Regarding probabilistic knapsack problems, Szkatula and Libura investigated the behavior of greedy 
algorithms, similar to the blind greedy algorithm used in this paper, for the knapsack problem with fixed 
capacity. They found recurrence equations describing the weight of the knapsack after each iteration and 
solved the equations for the case of uniform weights [T3] . In later work they studied asymptotic properties 
of greedy algorithms, including conditions for the knapsack to be filled almost surely as $n \to \infty$ [16].

There has been some work on asymptotic properties of the decreasing density greedy (DDG) algorithm. 
The DDG algorithm takes the best of two solutions: the one obtained by adding items in order of nonin- 
creasing profit to weight ratio until an item cannot be packed, and the solution resulting from adding only 
the item with highest profit. Diubin and Korbut showed properties of the asymptotical tolerance of greedy 
algorithms, which characterizes the deviation of the solution from the optimal value [S]. Similarly, Calvin 
and Leung showed convergence in distribution between the value obtained by the DDG algorithm and the 
value of the knapsack linear relaxation [7] . 



2 Blind Greedy Algorithm 

In this section we analyze properties of solutions selected by the blind greedy algorithm, shown in Algorithm
1. Since the algorithm behavior does not directly depend on the item weights and/or values (e.g., it does
not do any sorting), all of the properties described here hold for both the subset sum problem and knapsack
problem.

Previous work on this model has demonstrated that the critical item index is uniformly distributed on
$\{1, 2, \ldots, n\}$ [6] for cases of interest. In addition to this property, we show that the probability of a given
item being critical is independent of the weights of all other items.

Lemma 1. For each item $k = 1, \ldots, n$, for all sets of items $S \subseteq I \setminus \{k\}$ and all weights $w_S$,

$$
\mathbb{P}(K = k \mid W_S = w_S) = \frac{1}{2n}. \tag{21}
$$

² In this paper, we only analyze algorithms with lookahead length $\ell = 1$.
Proof. Assume that we are given the weights of all items $W_I = w_I$. We may divide the interval $[0, n]$ into
$n + 1$ segments as a function of item weights:

$$
\Big[ 0, \sum_{i=1}^{1} w_i \Big), \;
\Big[ \sum_{i=1}^{1} w_i, \sum_{i=1}^{2} w_i \Big), \; \ldots, \;
\Big[ \sum_{i=1}^{n-1} w_i, \sum_{i=1}^{n} w_i \Big), \;
\Big[ \sum_{i=1}^{n} w_i, n \Big]. \tag{22}
$$

The probability of item $k$ being critical is the probability that $B$ intersects the $k$th segment. Since $B$ is
distributed uniformly over this interval, we have

$$
\mathbb{P}(K = k \mid W_I = w_I) = \frac{w_k}{n}, \tag{23}
$$

showing that this event only depends on $w_k$. Accordingly

$$
\mathbb{P}(K = k \mid W_S = w_S) = \int_0^1 \mathbb{P}(K = k \mid W_I = w_I) f_{W_k}(w_k) \, dw_k = \frac{1}{n} \int_0^1 w_k \, dw_k = \frac{1}{2n}. \tag{24}
$$

□

Corollary 3.

$$
\mathbb{P}(K = k) =
\begin{cases}
\frac{1}{2n}, & k = 1, \ldots, n, \\
\frac{1}{2}, & k = 0.
\end{cases}
\tag{25}
$$
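Corollary 3 can be checked empirically by tabulating the critical item index over many random instances. The following is a minimal sketch under the model in (2); the function names and the choice $n = 4$ in the usage example are illustrative.

```python
import random
from collections import Counter
from typing import Dict, Sequence


def critical_index(weights: Sequence[float], capacity: float) -> int:
    """Return the 1-based index K of the first item that does not fit, or 0."""
    remaining = capacity
    for k, w in enumerate(weights, start=1):
        if w > remaining:
            return k
        remaining -= w
    return 0


def critical_index_frequencies(n: int, trials: int, seed: int = 0) -> Dict[int, float]:
    """Empirical distribution of K under W_i ~ U[0,1], B ~ U[0,n]."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(trials):
        weights = [rng.random() for _ in range(n)]
        capacity = rng.uniform(0.0, n)
        counts[critical_index(weights, capacity)] += 1
    return {k: counts[k] / trials for k in range(n + 1)}


if __name__ == "__main__":
    freqs = critical_index_frequencies(4, 200_000, seed=7)
    for k, f in freqs.items():
        print(k, round(f, 4))
```

With $n = 4$, the frequency of $K = 0$ should be near $1/2$ and each of $K = 1, \ldots, 4$ near $1/8$.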



An important property of this stochastic model, which we use throughout the rest of our development, 
is that conditioning on the critical item index only changes the weight distribution of the critical item; all 
other item weights remain independent and identically distributed on U[0, 1]. 

Lemma 2. For any given $K = k > 0$, the weights of items $I \setminus \{k\}$ are independent and identically distributed
on $\mathcal{U}[0, 1]$.

Proof. Fix a $k \in \{1, \ldots, n\}$. Consider any subset of items $S \subseteq I \setminus \{k\}$. Using Bayes' theorem, the joint
density for $W_S$ is given by

$$
f_{W_S \mid K}(w_S \mid k) = \frac{\mathbb{P}(K = k \mid W_S = w_S)}{\mathbb{P}(K = k)} f_{W_S}(w_S) = f_{W_S}(w_S), \tag{26}
$$

where we have used the result of the previous lemma. This holds for all $k = 1, \ldots, n$. □



Lemma 3. Independent of the critical item $K > 0$, the probability density function of the critical item weight
satisfies

$$
f_{W_K}(w) = 2w, \quad 0 \le w \le 1. \tag{27}
$$

Proof. Again fix any $k \in \{1, \ldots, n\}$. We have

$$
f_{W_k \mid K}(w_k \mid k) = \frac{\mathbb{P}(K = k \mid W_k = w_k)}{\mathbb{P}(K = k)} f_{W_k}(w_k) = \frac{w_k / n}{1/(2n)} = 2 w_k. \tag{28}
$$

□

We may now analyze the gap obtained by the blind greedy algorithm, given by

$$
G = B - \sum_{i=1}^{K-1} W_i \tag{29}
$$

for $K > 0$.
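Lemmas 2 and 3 together say that conditioning on the critical item skews only that item's weight (density $2w$, hence mean $2/3$ and $\mathbb{P}(W_K \le 1/2) = 1/4$), while the other weights stay uniform. A short simulation sketch, with illustrative function names and sample sizes, makes this visible:

```python
import random
from typing import List, Tuple


def critical_and_other_weights(n: int, trials: int,
                               seed: int = 0) -> Tuple[List[float], List[float]]:
    """Collect the critical item's weight and all other weights when K > 0."""
    rng = random.Random(seed)
    critical: List[float] = []
    others: List[float] = []
    for _ in range(trials):
        weights = [rng.random() for _ in range(n)]
        remaining = rng.uniform(0.0, n)
        for k, w in enumerate(weights):
            if w > remaining:
                critical.append(w)                          # weight of the critical item
                others.extend(weights[:k] + weights[k + 1:])  # every non-critical item
                break
            remaining -= w
    return critical, others


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


if __name__ == "__main__":
    crit, oth = critical_and_other_weights(4, 100_000, seed=3)
    print("mean critical weight:", round(mean(crit), 3))
    print("P(critical weight <= 1/2):", round(sum(w <= 0.5 for w in crit) / len(crit), 3))
    print("mean other weight:", round(mean(oth), 3))
```

Instances with no critical item ($K = 0$) contribute nothing, which mirrors the conditioning on $K > 0$ in the lemmas.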
Theorem 8. Independent of the critical item $K > 0$, the probability distribution of the gap obtained by the
blind greedy algorithm satisfies

$$
\mathbb{P}\Big( G \le g \,\Big|\, \sum_{i \in I} W_i > B \Big) = 2g - g^2, \quad 0 \le g \le 1, \tag{30}
$$

$$
\mathbb{E}\Big[ G \,\Big|\, \sum_{i \in I} W_i > B \Big] = \frac{1}{3}. \tag{31}
$$

Proof. For any fixed $K = k > 0$ and any $W_I = w_I$, the posterior distribution of $B$ satisfies

$$
f_{B \mid W_I, K}(b \mid w_I, k) = \mathcal{U}\Big[ \sum_{i=1}^{k-1} w_i, \sum_{i=1}^{k} w_i \Big], \tag{32}
$$

and thus

$$
f_{G \mid W_k, K}(g \mid w_k, k) = \mathcal{U}[0, w_k]. \tag{33}
$$

Using the distribution for $W_k$, we have

$$
f_{G \mid K}(g \mid k) = \int_0^1 f_{G \mid W_k, K}(g \mid w_k, k) f_{W_k \mid K}(w_k \mid k) \, dw_k = \int_g^1 \frac{1}{w_k} 2 w_k \, dw_k = 2 - 2g, \tag{34}
$$

where we have used that $G \le W_k$. We then have

$$
\mathbb{P}(G \le g \mid K > 0) = \int_0^g (2 - 2g') \, dg' = 2g - g^2, \tag{35}
$$

$$
\mathbb{E}[G \mid K > 0] = \int_0^1 (2g - 2g^2) \, dg = \frac{1}{3}. \tag{36}
$$

This serves as a simpler proof of the theorem from [6]; their proof is likely more conducive to their analysis. □

3 Consecutive Rollout 

In this section we show proofs for the performance bounds of the consecutive rollout algorithm on the subset
sum problem, followed by the knapsack problem. The consecutive rollout algorithm, shown in Algorithm 2,
considers items one at a time and decides whether or not to add the item to the knapsack. We only consider
the effect of the first iteration of the algorithm. The resulting algorithm takes the best of two solutions: the
greedy solution and the greedy solution obtained after removing the first item.

For both problems, we perform a case analysis based on the index of the critical item. The main proof
technique is to consider the contribution of only a few items to the performance. The subset sum problem
admits a graphical interpretation that simplifies the analysis. To reduce notational clutter, we often write
conditional probabilities of the form $\mathbb{P}(\cdot \mid X = x, Y = y)$ as $\mathbb{P}(\cdot \mid x, y)$. The corresponding random variables
should be clear from context. We use $C_k$ to denote the event that item $k$ is critical.

3.1 Subset Sum Problem

Let $V$ be the gap obtained by the blind greedy algorithm after removing the first item, and define
$V_* = \min(V, G)$. We refer to $V$ and $V_*$ as the drop gap and minimum gap, respectively. We are interested in
finding $\mathbb{E}[V_* \mid \sum_{i \in I} W_i > B]$ as a general function of $n$. The following lemmas show expected gap values
conditioned on all possible critical items.
Lemma 4. For $K = n$, the expected minimum gap satisfies

$$
\mathbb{E}[V_* \mid K = n] = \frac{1}{4}. \tag{37}
$$

Proof. Figure 5 shows $V$ as a function of $w_1$, $w_n$, and $g$. Since $W_1$ has distribution $\mathcal{U}[0, 1]$, we may simply
take the total length of the bold regions to find $\mathbb{P}(V > v \mid C_n, w_n, g)$. Thus for $v < g$,

$$
\mathbb{P}(V > v \mid C_n, w_n, g) = (w_n - g) + (1 - w_n + g - v) = 1 - v, \tag{38}
$$

where we have used that $1 - w_n + g - v$ is nonnegative since $v < g$ and $w_n \le 1$.

Figure 5: Gap $V$ as a function of $w_1$, $w_n$, and $g$ resulting from the removal of the first item, assuming that
the last item is critical ($K = n$). The function starts at $g$ and increases at unit rate until $w_1 = w_n - g$, where
it drops to zero, and then continues to increase at unit rate. The probability of the event $V > v$ is given by the
total length of the bold regions, assuming that $v < g$.

To find the probability of the event $V_* > v$, we note that the events $V > v$ and $G > v$ are conditionally
independent given $G = g$, so

$$
\mathbb{P}(V > v, G > v \mid C_n, w_n, g) = (1 - v) \, \mathbb{1}(v < g). \tag{39}
$$

Marginalizing over $G$ gives

$$
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C_n, w_n) &= \int_0^{w_n} \mathbb{P}(V > v, G > v \mid C_n, w_n, g) f_{G \mid C_n, w_n}(g \mid C_n, w_n) \, dg \\
&= \int_v^{w_n} (1 - v) \frac{1}{w_n} \, dg \\
&= \frac{(w_n - v)(1 - v)}{w_n}.
\end{aligned}
\tag{40}
$$

Noting the distribution of the critical item,

$$
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C_n) &= \int_0^1 \mathbb{P}(V > v, G > v \mid C_n, w_n) f_{W_n \mid C_n}(w_n \mid C_n) \, dw_n \\
&= \int_v^1 \frac{(w_n - v)(1 - v)}{w_n} 2 w_n \, dw_n \\
&= 1 - 3v + 3v^2 - v^3.
\end{aligned}
\tag{41}
$$

Finally, using the fact that $V_*$ is nonnegative,

$$
\mathbb{E}[V_* \mid C_n] = \int_0^1 \mathbb{P}(V_* > v \mid C_n) \, dv = \int_0^1 (1 - 3v + 3v^2 - v^3) \, dv = \frac{1}{4}. \tag{42}
$$

□
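The value $1/4$ in Lemma 4 can be checked by simulation, conditioning on the event that the last item is critical. The sketch below is illustrative only; the function names, the choice $n = 3$, and the rejection-sampling approach are assumptions made here.

```python
import random
from typing import Sequence


def greedy_gap(weights: Sequence[float], capacity: float) -> float:
    """Capacity left over after running blind greedy on the ordered weights."""
    remaining = capacity
    for w in weights:
        if w > remaining:
            break
        remaining -= w
    return remaining


def min_gap_given_last_critical(n: int, trials: int, seed: int = 0) -> float:
    """Estimate E[min(V, G) | K = n] by rejection sampling."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(trials):
        weights = [rng.random() for _ in range(n)]
        capacity = rng.uniform(0.0, n)
        prefix = sum(weights[:-1])
        # K = n means items 1..n-1 all fit and item n does not.
        if not (prefix <= capacity < prefix + weights[-1]):
            continue
        g = capacity - prefix                  # greedy gap G
        v = greedy_gap(weights[1:], capacity)  # drop gap V (first item removed)
        total += min(g, v)
        count += 1
    return total / count


if __name__ == "__main__":
    print(round(min_gap_given_last_critical(3, 300_000, seed=4), 4))
```

Only about a fraction $1/(2n)$ of the sampled instances satisfy $K = n$, so the number of trials is chosen large enough to leave tens of thousands of accepted samples.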

Lemma 5. For $2 \le K \le n - 1$, the expected minimum gap satisfies

$$
\frac{1}{6} \le \mathbb{E}[V_* \mid 2 \le K \le n - 1] \le \frac{13}{60}. \tag{43}
$$

Proof. We show that the property holds for every fixed $K = k$ with $2 \le k \le n - 1$. The drop gap $V$ for this
case is shown in Figure 6, where it is a function of $w_1$, $w_k$, $w_{k+1}$, and $g$. We again use the property that $W_1$
has distribution $\mathcal{U}[0, 1]$.




Figure 6: Gap $V$ as a function of $w_1$, $w_k$, $w_{k+1}$, and $g$ resulting from the removal of the first item, assuming
that $K = k$ with $2 \le k \le n - 1$. The function starts at $g$ and always increases at unit rate, except at
$w_1 = w_k - g$ and $w_1 = w_k - g + w_{k+1}$, where the function drops to zero. The probability of the event $V > v$ is
given by the total length of the bold regions, assuming that $v < g$. Note that in the figure, $w_k - g + w_{k+1} < 1$
and the second two bold segments have positive length; these properties do not hold in general.

Let $C$ be the event that $2 \le K \le n - 1$. Accounting for all possible values of $w_k$, $w_{k+1}$, and $g$, we have for $v < g$,

\[
\begin{aligned}
\mathbb{P}(V > v \mid C, g, w_k, w_{k+1})
&\le (w_k - g) + (w_{k+1} - v)_+ + (1 - w_k + g - w_{k+1} - v)_+ - (w_k - g + w_{k+1} - 1)_+ \\
&=: \mathbb{P}_u(V > v \mid C, g, w_k, w_{k+1}).
\end{aligned}
\tag{44}
\]

The first three terms come from the three bold regions shown in Figure 6. We have assumed that $v < g$, so the length of the first segment is always $w_k - g$. For the second term, it is possible that $v > w_{k+1}$, so we only take the positive portion of $w_{k+1} - v$. Taking only the positive portion of the third term is necessary for the cases where (1) item $k + 1$ does not become feasible ($w_k - g + w_{k+1} > 1$) and (2) if it is feasible, where $v$ is greater than the height of the third peak ($v > 1 - w_k + g - w_{k+1}$).

The last term is required for the case where item $k + 1$ does not become feasible, as we must subtract the length of the bold region that potentially extends beyond $w_1 = 1$. Note that we always subtract one since it is not possible for the point where the second peak intersects $V = v$ to be greater than one. To see this, assume the contrary so that $v + w_k - g > 1$. Since $w_k \le 1$, this would imply that $g < v$, violating our assumption.

Finally, the expression is an inequality because if item $k + 1$ becomes feasible, it is also possible for item $k + 2$ (if it exists) to become feasible. Such an event would yield four peaks, where the lengths of the last two components would be less than or equal to $(1 - w_k + g - w_{k+1} - v)$. The same reasoning holds if even more items become feasible.

For the lower bound, we must remove the third term in (44) because it is possible for many small peaks (i.e. with heights below $V = v$) to be present in the region $w_k - g + w_{k+1} \le w_1 \le 1$, corresponding to additional items becoming feasible.

\[
\begin{aligned}
\mathbb{P}(V > v \mid C, g, w_k, w_{k+1})
&\ge (w_k - g) + (w_{k+1} - v)_+ - (w_k - g + w_{k+1} - 1)_+ \\
&=: \mathbb{P}_l(V > v \mid C, g, w_k, w_{k+1}).
\end{aligned}
\tag{45}
\]

Now considering both $V$ and $G$ gives the bounds

\[
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C, g, w_k, w_{k+1})
&\le \mathbb{P}_u(V > v \mid C, g, w_k, w_{k+1})\,\mathbf{1}(v < g) \\
&=: \mathbb{P}_u(V > v, G > v \mid C, g, w_k, w_{k+1}),
\end{aligned}
\tag{46}
\]

\[
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C, g, w_k, w_{k+1})
&\ge \mathbb{P}_l(V > v \mid C, g, w_k, w_{k+1})\,\mathbf{1}(v < g) \\
&=: \mathbb{P}_l(V > v, G > v \mid C, g, w_k, w_{k+1}).
\end{aligned}
\tag{47}
\]

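Marginalizing over $w_{k+1}$, which has uniform density according to Lemma 2, gives

\[
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C, g, w_k)
&= \int_0^1 \mathbb{P}(V > v, G > v \mid C, g, w_k, w_{k+1})\, f_{W_{k+1}}(w_{k+1})\, dw_{k+1} \\
&\le \int_0^1 \mathbb{P}_u(V > v, G > v \mid C, g, w_k, w_{k+1})\, f_{W_{k+1}}(w_{k+1})\, dw_{k+1} \\
&= \Big\{ (w_k - g) + \int_v^1 (w_{k+1} - v)\, dw_{k+1} - \int_{1 + g - w_k}^1 (w_k - g + w_{k+1} - 1)\, dw_{k+1} \\
&\qquad + \int_0^{1 - w_k + g - v} (1 - w_k + g - w_{k+1} - v)_+\, dw_{k+1} \Big\}\,\mathbf{1}(v < g) \\
&= \Big\{ (w_k - g) + \tfrac{1}{2}(1 - v)^2 - \tfrac{1}{2}(w_k - g)^2 + \tfrac{1}{2}(1 - w_k + g - v)_+^2 \Big\}\,\mathbf{1}(v < g) \\
&=: \mathbb{P}_u(V > v, G > v \mid C, g, w_k),
\end{aligned}
\tag{48}
\]

\[
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C, g, w_k)
&\ge \int_0^1 \mathbb{P}_l(V > v, G > v \mid C, g, w_k, w_{k+1})\, f_{W_{k+1}}(w_{k+1})\, dw_{k+1} \\
&= \Big\{ (w_k - g) + \int_v^1 (w_{k+1} - v)\, dw_{k+1} - \int_{1 + g - w_k}^1 (w_k - g + w_{k+1} - 1)\, dw_{k+1} \Big\}\,\mathbf{1}(v < g) \\
&= \Big\{ (w_k - g) + \tfrac{1}{2}(1 - v)^2 - \tfrac{1}{2}(w_k - g)^2 \Big\}\,\mathbf{1}(v < g) \\
&=: \mathbb{P}_l(V > v, G > v \mid C, g, w_k).
\end{aligned}
\tag{49}
\]

We note again that $f_{G \mid C, W_k}(g \mid C, w_k) = U(0, w_k)$. Thus,

\[
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C, w_k)
&= \int_0^{w_k} \mathbb{P}(V > v, G > v \mid C, g, w_k)\, f_{G \mid C, W_k}(g \mid C, w_k)\, dg \\
&\le \int_0^{w_k} \mathbb{P}_u(V > v, G > v \mid C, g, w_k)\, f_{G \mid C, W_k}(g \mid C, w_k)\, dg \\
&= \int_v^{w_k} \Big\{ (w_k - g) + \tfrac{1}{2}(1 - v)^2 - \tfrac{1}{2}(w_k - g)^2 + \tfrac{1}{2}(1 - w_k + g - v)_+^2 \Big\} \frac{1}{w_k}\, dg \\
&= 1 - 2v - \frac{v}{w_k} + \frac{2v^2}{w_k} - \frac{v^3}{2 w_k} + \frac{v w_k}{2} \\
&=: \mathbb{P}_u(V > v, G > v \mid C, w_k),
\end{aligned}
\]

\[
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C, w_k)
&\ge \int_0^{w_k} \mathbb{P}_l(V > v, G > v \mid C, g, w_k)\, f_{G \mid C, W_k}(g \mid C, w_k)\, dg \\
&= \int_v^{w_k} \Big\{ (w_k - g) + \tfrac{1}{2}(1 - v)^2 - \tfrac{1}{2}(w_k - g)^2 \Big\} \frac{1}{w_k}\, dg \\
&= \frac{1}{2} - 2v - \frac{v}{2 w_k} + \frac{3 v^2}{2 w_k} - \frac{v^3}{3 w_k} + \frac{w_k}{2} + \frac{v w_k}{2} - \frac{w_k^2}{6} \\
&=: \mathbb{P}_l(V > v, G > v \mid C, w_k).
\end{aligned}
\]

Finally, we integrate over $w_k$.

\[
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C)
&= \int_v^1 \mathbb{P}(V > v, G > v \mid C, w_k)\, f_{W_k}(w_k)\, dw_k \\
&\le \int_v^1 \mathbb{P}_u(V > v, G > v \mid C, w_k)\, f_{W_k}(w_k)\, dw_k \\
&= \int_v^1 \Big( 1 - 2v - \frac{v}{w_k} + \frac{2v^2}{w_k} - \frac{v^3}{2 w_k} + \frac{v w_k}{2} \Big)\, 2 w_k\, dw_k \\
&= 1 - \frac{11 v}{3} + 5 v^2 - 3 v^3 + \frac{2 v^4}{3} \\
&=: \mathbb{P}_u(V^* > v \mid C),
\end{aligned}
\]

\[
\begin{aligned}
\mathbb{P}(V > v, G > v \mid C)
&\ge \int_v^1 \mathbb{P}_l(V > v, G > v \mid C, w_k)\, f_{W_k}(w_k)\, dw_k \\
&= \int_v^1 \Big( \frac{1}{2} - 2v - \frac{v}{2 w_k} + \frac{3 v^2}{2 w_k} - \frac{v^3}{3 w_k} + \frac{w_k}{2} + \frac{v w_k}{2} - \frac{w_k^2}{6} \Big)\, 2 w_k\, dw_k \\
&= \frac{3}{4} - \frac{8 v}{3} + \frac{7 v^2}{2} - 2 v^3 + \frac{5 v^4}{12} \\
&=: \mathbb{P}_l(V^* > v \mid C).
\end{aligned}
\]

Since $V^*$ is nonnegative, we have

\[ \mathbb{E}[V^* \mid C] = \int_0^1 \mathbb{P}(V^* > v \mid C)\, dv \le \int_0^1 \mathbb{P}_u(V^* > v \mid C)\, dv = \int_0^1 \Big( 1 - \frac{11 v}{3} + 5 v^2 - 3 v^3 + \frac{2 v^4}{3} \Big)\, dv = \frac{13}{60}, \]

\[ \mathbb{E}[V^* \mid C] \ge \int_0^1 \mathbb{P}_l(V^* > v \mid C)\, dv = \int_0^1 \Big( \frac{3}{4} - \frac{8 v}{3} + \frac{7 v^2}{2} - 2 v^3 + \frac{5 v^4}{12} \Big)\, dv = \frac{1}{6}. \tag{55} \]

□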
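The two final integrals in the proof of Lemma 5 are integrals of polynomials over $[0, 1]$, so they can be checked with exact rational arithmetic. This is a small illustrative sketch; the helper name `integrate_poly_01` is an assumption and does not appear in the paper.

```python
from fractions import Fraction as F


def integrate_poly_01(coeffs):
    """Integrate sum_k coeffs[k] * v**k over [0, 1] exactly."""
    return sum(F(c) / (k + 1) for k, c in enumerate(coeffs))


# Coefficients of P_u(V* > v | C) and P_l(V* > v | C), lowest degree first.
upper_tail = [1, F(-11, 3), 5, -3, F(2, 3)]
lower_tail = [F(3, 4), F(-8, 3), F(7, 2), -2, F(5, 12)]

lemma5_upper = integrate_poly_01(upper_tail)
lemma5_lower = integrate_poly_01(lower_tail)
```

The results reproduce the bounds 13/60 and 1/6 stated in (43).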



Lemma 6. For $K = 1$, the expected minimum gap satisfies

\[ \frac{13}{60} \le \mathbb{E}[V^* \mid C_1] \le \frac{7}{30}. \tag{56} \]

Proof. We use a different approach when the first item is critical since $W_1$ no longer has a uniform distribution. Define the drop critical item as the item that is critical after the first item is removed. Define $D_2$ and $D_3$ as the events where the second and third items are the drop critical items, respectively. Also let $D_{4+}$ be the event where an item with index $i \ge 4$ is the drop critical item (assuming $n \ge 4$) or there is no drop critical item (if $n = 3$). Based on the different cases for the subcritical items, the minimum gap $V^*$ is given by the values shown in Table 1.



Table 1: Minimum gap values when the first item is the critical item ($C_1$).

Case        Defining inequalities                 Minimum gap bound
$D_2$       $W_2 > G$                             $V^* = G$
$D_3$       $W_2 \le G$, $W_2 + W_3 > G$          $V^* = G - W_2$
$D_{4+}$    $W_2 + W_3 \le G$                     $V^* \le G - W_2 - W_3$



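We begin by finding some necessary distributions for the cases. For case $D_3$, the posterior distribution of $W_2$ is needed.

\[
\begin{aligned}
f_{W_2 \mid C_1, D_3, G}(w_2 \mid C_1, D_3, g)
&= f(w_2 \mid C_1, W_2 \le G, W_2 + W_3 > G, g) \\
&= \frac{\mathbb{P}(W_2 \le G, W_2 + W_3 > G \mid C_1, g, w_2)\, f_{W_2}(w_2)}{\mathbb{P}(W_2 \le G, W_2 + W_3 > G \mid C_1, g)},
\end{aligned}
\tag{57}
\]

where we have used that $f_{W_2 \mid C_1, G}(w_2 \mid C_1, g) = f_{W_2}(w_2) = U[0, 1]$ by Lemma 2. For the numerator, we have

\[ \mathbb{P}(W_2 \le G, W_2 + W_3 > G \mid C_1, g, w_2) = (1 - g + w_2)\,\mathbf{1}(w_2 \le g). \tag{58} \]

Integrating over $W_2$ gives

\[ \mathbb{P}(D_3 \mid C_1, g) = \int_0^1 \mathbb{P}(W_2 \le g, W_2 + W_3 > G \mid C_1, g, w_2)\, f_{W_2}(w_2)\, dw_2 = \int_0^g (1 - g + w_2)\, dw_2 = g - \frac{g^2}{2}. \tag{59} \]

Returning to the posterior distribution of $W_2$,

\[ f_{W_2 \mid C_1, D_3, G}(w_2 \mid C_1, D_3, g) = \frac{1 - g + w_2}{g - g^2/2} = \frac{2(1 - g + w_2)}{(2 - g) g}, \quad 0 \le w_2 \le g, \tag{60} \]

\[ \mathbb{P}(W_2 \le w_2 \mid C_1, D_3, g) = \frac{(2 - 2g + w_2)\, w_2}{(2 - g) g}, \quad 0 \le w_2 \le g. \tag{61} \]

Moving to the case $D_{4+}$, let $W' = W_2 + W_3$;

\[ \mathbb{P}(D_{4+} \mid C_1, g) = \mathbb{P}(W' \le g \mid C_1, g) = \frac{g^2}{2}, \tag{62} \]

where we have used that the distribution of $W'$ conditioned on the first item being critical is the distribution for the sum of two independent uniform random variables (via Lemma 2). Now for the posterior distribution of $W_2 + W_3$, we have

\[ f_{W' \mid C_1, D_{4+}, G}(w' \mid C_1, D_{4+}, g) = \frac{f_{W' \mid C_1}(w' \mid C_1)}{\mathbb{P}(D_{4+} \mid C_1, g)} = \frac{2 w'}{g^2}, \quad 0 \le w' \le g. \tag{63} \]

The cumulative distribution function is then

\[ \mathbb{P}(W' \le w' \mid C_1, D_{4+}, g) = \frac{w'^2}{g^2}, \quad 0 \le w' \le g. \tag{64} \]

We may now find distributions for the minimum gap $V^*$ conditioned on all cases for the drop critical item. For case $D_2$, it is clear that $V^* = G$, and

\[ \mathbb{P}(D_2 \mid C_1, g) = \mathbb{P}(W_2 > g) = 1 - g. \tag{65} \]

For $D_3$, we have

\[
\begin{aligned}
\mathbb{P}(V^* > v \mid C_1, D_3, g)
&= \mathbb{P}(G - W_2 > v \mid C_1, D_3, g) \\
&= \mathbb{P}(W_2 < G - v \mid C_1, D_3, g) \\
&= \frac{(2 - 2g + (g - v))(g - v)}{(2 - g) g} \\
&= \frac{(2 - g - v)(g - v)}{(2 - g) g}, \quad 0 \le v \le g.
\end{aligned}
\tag{66}
\]

Note that the total weight packed, $W(D_{4+})$, satisfies

\[ W(D_{4+}) \ge W_2 + W_3 = W'. \tag{67} \]

The upper bound on probability is

\[
\begin{aligned}
\mathbb{P}(V^* > v \mid C_1, D_{4+}, g)
&= \mathbb{P}(G - W(D_{4+}) > v \mid C_1, D_{4+}, g) \\
&= \mathbb{P}(W(D_{4+}) < G - v \mid C_1, D_{4+}, g) \\
&\le \mathbb{P}(W' < G - v \mid C_1, D_{4+}, g) \\
&= \frac{(g - v)^2}{g^2}, \quad 0 \le v \le g.
\end{aligned}
\tag{68}
\]

It is only possible to guarantee a trivial lower bound,

\[ \mathbb{P}(V^* > v \mid C_1, D_{4+}, g) \ge 0. \tag{69} \]

Considering all three cases, we have

\[
\begin{aligned}
\mathbb{P}(V^* > v \mid C_1, g)
&= \mathbb{P}(V^* > v \mid C_1, D_2, g)\,\mathbb{P}(D_2 \mid C_1, g) + \mathbb{P}(V^* > v \mid C_1, D_3, g)\,\mathbb{P}(D_3 \mid C_1, g) \\
&\quad + \mathbb{P}(V^* > v \mid C_1, D_{4+}, g)\,\mathbb{P}(D_{4+} \mid C_1, g) \\
&\le (1 - v - g v + v^2)\,\mathbf{1}(v < g),
\end{aligned}
\tag{70}
\]

\[
\begin{aligned}
\mathbb{P}(V^* > v \mid C_1, g)
&\ge \Big\{ (1 - g) + \tfrac{1}{2}(2 - g - v)(g - v) \Big\}\,\mathbf{1}(v < g) \\
&= \Big( 1 - v - \frac{g^2}{2} + \frac{v^2}{2} \Big)\,\mathbf{1}(v < g).
\end{aligned}
\tag{71}
\]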
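The expected value is then bounded by

\[ \mathbb{E}[V^* \mid C_1, g] \le \int_0^g (1 - v - g v + v^2)\, dv = g - \frac{g^2}{2} - \frac{g^3}{6}, \tag{72} \]

\[ \mathbb{E}[V^* \mid C_1, g] \ge \int_0^g \Big( 1 - v - \frac{g^2}{2} + \frac{v^2}{2} \Big)\, dv = g - \frac{g^2}{2} - \frac{g^3}{3}. \tag{73} \]

Finally, integrating over $G$,

\[ \mathbb{E}[V^* \mid C_1] \le \int_0^1 \mathbb{E}[V^* \mid C_1, g]\, f_G(g)\, dg = \int_0^1 \Big( g - \frac{g^2}{2} - \frac{g^3}{6} \Big)(2 - 2g)\, dg = \frac{7}{30}, \tag{74} \]

\[ \mathbb{E}[V^* \mid C_1] \ge \int_0^1 \mathbb{E}[V^* \mid C_1, g]\, f_G(g)\, dg = \int_0^1 \Big( g - \frac{g^2}{2} - \frac{g^3}{3} \Big)(2 - 2g)\, dg = \frac{13}{60}. \tag{75} \]

□

We now arrive at the bound for the subset sum problem.

Theorem 2. For the subset sum problem with $n \ge 3$, the gap $V^*$ obtained by running a single iteration of the consecutive rollout algorithm satisfies

\[ \frac{4 + 5n}{30 n} \le \mathbb{E}\Big[ V^* \,\Big|\, \sum_{i \in I} W_i > B \Big] \le \frac{3 + 13 n}{60 n}. \tag{76} \]

Proof. The events $C_1$, $C$, and $C_n$ form a partition of the event $\sum_{i \in I} W_i > B$, giving

\[
\begin{aligned}
\mathbb{E}\Big[ V^* \,\Big|\, \sum_{i \in I} W_i > B \Big]
&= \mathbb{E}[V^* \mid C_1]\,\mathbb{P}(C_1) + \mathbb{E}[V^* \mid C]\,\mathbb{P}(C) + \mathbb{E}[V^* \mid C_n]\,\mathbb{P}(C_n) \\
&\le \frac{7}{30}\Big( \frac{1}{n} \Big) + \frac{13}{60}\Big( \frac{n - 2}{n} \Big) + \frac{1}{4}\Big( \frac{1}{n} \Big) = \frac{3 + 13 n}{60 n}.
\end{aligned}
\tag{77}
\]

The lower bound is found similarly.

□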
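The combination step in the proof of Theorem 2 can be reproduced with exact fractions. The sketch below assumes, as in the proof, that the critical item index is uniform over the $n$ positions, and it plugs in the conditional bounds from Lemmas 4 to 6; the function name is illustrative.

```python
from fractions import Fraction as F


def consecutive_rollout_gap_bounds(n):
    """Combine the conditional bounds for K = 1, 2 <= K <= n - 1, and K = n."""
    p_first, p_middle, p_last = F(1, n), F(n - 2, n), F(1, n)
    upper = F(7, 30) * p_first + F(13, 60) * p_middle + F(1, 4) * p_last
    lower = F(13, 60) * p_first + F(1, 6) * p_middle + F(1, 4) * p_last
    return lower, upper
```

For example, `consecutive_rollout_gap_bounds(10)` returns the pair $(54/300, 133/600)$ in lowest terms, matching $(4 + 5n)/(30n)$ and $(3 + 13n)/(60n)$ at $n = 10$.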



3.2 Knapsack Problem 

We proceed similarly for the knapsack problem, bounding the gain of the rollout algorithm conditioned on 
the critical item index. 

Lemma 7. For $K = n$, the expected gain $Z$ satisfies

\[ \mathbb{E}[Z \mid K = n] = \frac{1}{9}. \tag{78} \]

Proof. A gain is only obtained in the case where the last item becomes feasible when removing the first. Consistent with our previous notation, let $D_{n+1}$ denote the event that item $n$ becomes feasible when the first item is removed. The probability is given by

\[ \mathbb{P}(D_{n+1} \mid C_n, g, w_n) = \mathbb{P}(W_1 \ge w_n - g \mid C_n, w_n, g) = 1 - w_n + g. \tag{79} \]

Integrating over $G$ and $W_n$ gives

\[ \mathbb{P}(D_{n+1} \mid C_n, w_n) = \int_0^{w_n} (1 - w_n + g)\, \frac{1}{w_n}\, dg = 1 - \frac{w_n}{2}, \tag{80} \]

\[ \mathbb{P}(D_{n+1} \mid C_n) = \int_0^1 \Big( 1 - \frac{w_n}{2} \Big)\, 2 w_n\, dw_n = \frac{2}{3}. \tag{81} \]

Now assuming that item $n$ becomes feasible, we are interested in the case where it provides a larger value. This is simply given by the probability

\[ \mathbb{P}(P_n > P_1) = \frac{1}{2}. \tag{82} \]

Conditioned on the event that item $n$ provides a larger value, the distribution for the gain is of interest.

\[ \mathbb{P}(P_n - P_1 \le q \mid P_n > P_1) = \frac{\mathbb{P}(0 < P_n - P_1 \le q)}{\mathbb{P}(P_n > P_1)}. \tag{83} \]

For the numerator,

\[ \mathbb{P}(0 < P_n - P_1 \le q) = \int_0^{1 - q} \int_{p_1}^{p_1 + q} dp_n\, dp_1 + \int_{1 - q}^1 \int_{p_1}^1 dp_n\, dp_1 = q - \frac{q^2}{2}, \tag{84} \]

which gives

\[ \mathbb{P}(P_n - P_1 \le q \mid P_n > P_1) = 2q - q^2, \tag{85} \]

\[ \mathbb{E}[P_n - P_1 \mid P_n > P_1] = \frac{1}{3}. \tag{86} \]

Finally, we have

\[ \mathbb{E}[Z \mid C_n] = \mathbb{E}[P_n - P_1 \mid P_n > P_1]\,\mathbb{P}(P_n > P_1)\,\mathbb{P}(D_{n+1} \mid C_n) = \frac{1}{3} \cdot \frac{1}{2} \cdot \frac{2}{3} = \frac{1}{9}. \tag{87} \]

□
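The product in (87) can be sanity-checked by sampling the same conditional model. This is a hedged sketch only: it assumes uniform profits on $[0, 1]$ that are independent of the weights, and it uses the conditional weight and gap distributions quoted in the proof. The function name, sample count, and seed are illustrative.

```python
import random


def simulate_gain_last_critical(samples=200_000, seed=2):
    """Monte Carlo estimate of E[Z | K = n] for the first rollout iteration."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        w_n = rng.random() ** 0.5          # critical weight, density 2*w
        g = rng.uniform(0.0, w_n)          # greedy gap, uniform on [0, w_n]
        w_1 = rng.random()                 # weight of the first item
        p_1, p_n = rng.random(), rng.random()
        if w_1 >= w_n - g:                 # item n fits once item 1 is removed
            total += max(p_n - p_1, 0.0)   # keep the swap only if it pays off
    return total / samples
```

Under these assumptions the estimate should be close to $1/9 \approx 0.111$.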



Lemma 8. For $2 \le K \le n - 1$, the expected gain $Z$ satisfies

\[ \mathbb{E}[Z \mid 2 \le K \le n - 1] \ge \frac{59}{288} \approx 0.205. \tag{88} \]

Proof. We again let $C$ be the event that $2 \le K \le n - 1$. We fix $K = k$, and the proof holds for all valid values of $k$. In this case, it is possible that removing the first item allows for the critical item to be feasible as well as additional items. We are only guaranteed the existence of one item beyond the critical item. Let $D_{k+1}$ indicate the event that item $k + 1$ becomes the drop critical item, and let $D_{(k+2)+}$ be the event that an item with index $i \ge k + 2$ becomes the drop critical item. If such an item does not exist, then the event means that all items are packed.

The event $D_{k+1}$ indicates that only item $k$ becomes feasible when removing the first item. For the probability of this event, we have

\[ \mathbb{P}(D_{k+1} \mid C, g, w_k, w_{k+1}) = w_{k+1} - (w_k - g + w_{k+1} - 1)_+. \tag{89} \]

Similarly, for the event $D_{(k+2)+}$, we have

\[ \mathbb{P}(D_{(k+2)+} \mid C, g, w_k, w_{k+1}) = (1 - w_k + g - w_{k+1})_+. \tag{90} \]
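Starting with event $D_{k+1}$, we integrate over $W_{k+1}$, which has uniform density.

\[
\begin{aligned}
\mathbb{P}(D_{k+1} \mid C, g, w_k)
&= \int_0^1 \mathbb{P}(D_{k+1} \mid C, g, w_k, w_{k+1})\, f_{W_{k+1}}(w_{k+1})\, dw_{k+1} \\
&= \int_0^1 w_{k+1}\, dw_{k+1} - \int_{1 + g - w_k}^1 (w_k - g + w_{k+1} - 1)\, dw_{k+1} \\
&= \frac{1}{2} - \frac{g^2}{2} + g w_k - \frac{w_k^2}{2}.
\end{aligned}
\tag{91}
\]

Marginalizing over $G$ gives

\[
\mathbb{P}(D_{k+1} \mid C, w_k) = \int_0^{w_k} \mathbb{P}(D_{k+1} \mid C, g, w_k)\, f_{G \mid W_k}(g \mid w_k)\, dg = \int_0^{w_k} \Big( \frac{1}{2} - \frac{g^2}{2} + g w_k - \frac{w_k^2}{2} \Big) \frac{1}{w_k}\, dg = \frac{1}{2} - \frac{w_k^2}{6}.
\tag{92}
\]

Finally,

\[
\mathbb{P}(D_{k+1} \mid C) = \int_0^1 \mathbb{P}(D_{k+1} \mid C, w_k)\, f_{W_k}(w_k)\, dw_k = \int_0^1 \Big( \frac{1}{2} - \frac{w_k^2}{6} \Big)\, 2 w_k\, dw_k = \frac{5}{12}.
\tag{93}
\]

Now for the event $D_{(k+2)+}$, we integrate in the same order.

\[
\begin{aligned}
\mathbb{P}(D_{(k+2)+} \mid C, g, w_k)
&= \int_0^1 \mathbb{P}(D_{(k+2)+} \mid C, g, w_k, w_{k+1})\, f_{W_{k+1}}(w_{k+1})\, dw_{k+1} \\
&= \int_0^{1 - w_k + g} (1 - w_k + g - w_{k+1})\, dw_{k+1} \\
&= \frac{1}{2} + g + \frac{g^2}{2} - w_k - g w_k + \frac{w_k^2}{2},
\end{aligned}
\tag{94}
\]

\[
\begin{aligned}
\mathbb{P}(D_{(k+2)+} \mid C, w_k)
&= \int_0^{w_k} \mathbb{P}(D_{(k+2)+} \mid C, g, w_k)\, f_{G \mid W_k}(g \mid w_k)\, dg \\
&= \int_0^{w_k} \Big( \frac{1}{2} + g + \frac{g^2}{2} - w_k - g w_k + \frac{w_k^2}{2} \Big) \frac{1}{w_k}\, dg \\
&= \frac{1}{2} - \frac{w_k}{2} + \frac{w_k^2}{6},
\end{aligned}
\tag{95}
\]

\[
\mathbb{P}(D_{(k+2)+} \mid C) = \int_0^1 \mathbb{P}(D_{(k+2)+} \mid C, w_k)\, f_{W_k}(w_k)\, dw_k = \int_0^1 \Big( \frac{1}{2} - \frac{w_k}{2} + \frac{w_k^2}{6} \Big)\, 2 w_k\, dw_k = \frac{1}{4}.
\tag{96}
\]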



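Equipped with these probabilities, we now consider the gain from the rollout for the different drop critical item cases. For the case where only one item becomes feasible ($D_{k+1}$), the analysis for the previous case holds, so we have

\[ \mathbb{E}[P_n - P_1 \mid P_n > P_1]\,\mathbb{P}(P_n > P_1)\,\mathbb{P}(D_{k+1} \mid C) = \frac{1}{3} \cdot \frac{1}{2} \cdot \frac{5}{12} = \frac{5}{72}. \tag{97} \]

If two or more items become feasible ($D_{(k+2)+}$), we only consider the gain resulting from adding two items, and this serves as a lower bound for the case of more items becoming feasible. Accordingly, define

\[ P' = P_k + P_{k+1}. \tag{98} \]

The probability that the profits of the two items are greater than $P_1$ is given by

\[ \mathbb{P}(P' > P_1) = 1 - \mathbb{P}(P' \le P_1) = 1 - \int_0^1 \int_0^{p_1} p'\, dp'\, dp_1 = 1 - \int_0^1 \frac{p_1^2}{2}\, dp_1 = \frac{5}{6}. \tag{99} \]

The conditional gain is given by

\[ \mathbb{P}(P' - P_1 \le q \mid P' > P_1) = \frac{\mathbb{P}(0 < P' - P_1 \le q)}{\mathbb{P}(P' > P_1)}. \tag{100} \]

Proceeding with the numerator and assuming $0 \le q \le 1$,

\[
\begin{aligned}
\mathbb{P}(0 < P' - P_1 \le q)
&= \int_0^{1 - q} \int_{p_1}^{p_1 + q} p'\, dp'\, dp_1 + \int_{1 - q}^1 \int_{p_1}^1 p'\, dp'\, dp_1 + \int_{1 - q}^1 \int_1^{p_1 + q} (2 - p')\, dp'\, dp_1 \\
&= \frac{q}{2} + \frac{q^2}{2} - \frac{q^3}{3}, \quad 0 \le q \le 1.
\end{aligned}
\tag{101}
\]

Now for $1 < q \le 2$,

\[
\begin{aligned}
\mathbb{P}(0 < P' - P_1 \le q)
&= \int_0^1 \int_{p_1}^1 p'\, dp'\, dp_1 + \int_0^{2 - q} \int_1^{p_1 + q} (2 - p')\, dp'\, dp_1 + \int_{2 - q}^1 \int_1^2 (2 - p')\, dp'\, dp_1 \\
&= -\frac{1}{2} + 2q - q^2 + \frac{q^3}{6}, \quad 1 < q \le 2.
\end{aligned}
\tag{102}
\]

The distribution for the gain is thus given by dividing (101) and (102) by $\mathbb{P}(P' > P_1) = 5/6$:

\[
\mathbb{P}(P' - P_1 \le q \mid P' > P_1) =
\begin{cases}
\dfrac{6}{5}\Big( \dfrac{q}{2} + \dfrac{q^2}{2} - \dfrac{q^3}{3} \Big), & 0 \le q \le 1, \\[2ex]
\dfrac{6}{5}\Big( -\dfrac{1}{2} + 2q - q^2 + \dfrac{q^3}{6} \Big), & 1 < q \le 2.
\end{cases}
\]

The expected value is

\[ \mathbb{E}[P' - P_1 \mid P' > P_1] = \int_0^1 q \Big( \frac{3}{5} + \frac{6 q}{5} - \frac{6 q^2}{5} \Big)\, dq + \int_1^2 q \Big( \frac{12}{5} - \frac{12 q}{5} + \frac{3 q^2}{5} \Big)\, dq = \frac{13}{20}. \tag{104} \]

Recalling that it is possible for more than two items to be added in the case $D_{(k+2)+}$, let $P''$ be the total value of items added for the case. We may bound the expected gain as follows, where the event $D_k$ is omitted since it provides zero gain.

\[
\begin{aligned}
\mathbb{E}[Z \mid C]
&= \mathbb{E}[P'' - P_1 \mid P'' > P_1]\,\mathbb{P}(P'' > P_1)\,\mathbb{P}(D_{(k+2)+} \mid C) + \mathbb{E}[P_n - P_1 \mid P_n > P_1]\,\mathbb{P}(P_n > P_1)\,\mathbb{P}(D_{k+1} \mid C) \\
&\ge \mathbb{E}[P' - P_1 \mid P' > P_1]\,\mathbb{P}(P' > P_1)\,\mathbb{P}(D_{(k+2)+} \mid C) + \mathbb{E}[P_n - P_1 \mid P_n > P_1]\,\mathbb{P}(P_n > P_1)\,\mathbb{P}(D_{k+1} \mid C) \\
&= \frac{13}{20} \cdot \frac{5}{6} \cdot \frac{1}{4} + \frac{1}{3} \cdot \frac{1}{2} \cdot \frac{5}{12} = \frac{59}{288}.
\end{aligned}
\tag{105}
\]

□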
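The two-item quantities in (99) and (104) lend themselves to a quick simulation, and the final arithmetic in (105) can be done exactly. The sketch below assumes independent uniform profits on $[0, 1]$, as in the stochastic model; the function name, seed, and sample count are illustrative.

```python
import random
from fractions import Fraction as F


def two_item_gain_stats(samples=200_000, seed=1):
    """Estimate P(P' > P_1) and E[P' - P_1 | P' > P_1] with P' = P_k + P_{k+1}."""
    rng = random.Random(seed)
    wins, gain_sum = 0, 0.0
    for _ in range(samples):
        p_1 = rng.random()
        p_two = rng.random() + rng.random()   # profit of the two added items
        if p_two > p_1:
            wins += 1
            gain_sum += p_two - p_1
    return wins / samples, gain_sum / wins


# Exact combination of the two drop critical item cases in (105).
lemma8_bound = F(13, 20) * F(5, 6) * F(1, 4) + F(1, 3) * F(1, 2) * F(5, 12)
```

The simulated pair should be near $(5/6, 13/20)$, and `lemma8_bound` evaluates to $59/288$.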
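Lemma 9. For $K = 1$, the expected gain $Z$ satisfies

\[ \mathbb{E}[Z \mid K = 1] \ge \frac{5}{24} \approx 0.208. \tag{106} \]

Proof. We use the drop events $D_2$, $D_3$, and $D_{4+}$ just as we did for the subset sum problem. The event probabilities given $g$ are the same as those for the subset sum problem. Accordingly,

\[ \mathbb{P}(D_2 \mid C_1) = \int_0^1 \mathbb{P}(D_2 \mid C_1, g)\, f_G(g)\, dg = \int_0^1 (1 - g)(2 - 2g)\, dg = \frac{2}{3}, \tag{107} \]

\[ \mathbb{P}(D_3 \mid C_1) = \int_0^1 \mathbb{P}(D_3 \mid C_1, g)\, f_G(g)\, dg = \int_0^1 \Big( g - \frac{g^2}{2} \Big)(2 - 2g)\, dg = \frac{1}{4}, \tag{108} \]

\[ \mathbb{P}(D_{4+} \mid C_1) = \frac{1}{12}. \tag{109} \]

The greedy solution gives zero value, so the expected gain is easily determined.

\[ \mathbb{E}[Z \mid C_1, D_2] = 0, \tag{110} \]

\[ \mathbb{E}[Z \mid C_1, D_3] = \mathbb{E}[P_2] = \frac{1}{2}, \tag{111} \]

\[ \mathbb{E}[Z \mid C_1, D_{4+}] \ge \mathbb{E}[P_2 + P_3] = 1. \tag{112} \]

Combining all cases for the drop critical item,

\[
\begin{aligned}
\mathbb{E}[Z \mid C_1]
&= \mathbb{E}[Z \mid C_1, D_3]\,\mathbb{P}(D_3 \mid C_1) + \mathbb{E}[Z \mid C_1, D_{4+}]\,\mathbb{P}(D_{4+} \mid C_1) \\
&\ge \frac{1}{2} \cdot \frac{1}{4} + 1 \cdot \frac{1}{12} = \frac{5}{24}.
\end{aligned}
\tag{113}
\]

□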



The result for the knapsack problem is as follows. 



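Theorem 3. For the knapsack problem with $n \ge 3$, the gain $Z$ obtained by running a single iteration of the consecutive rollout algorithm satisfies

\[ \mathbb{E}\Big[ Z \,\Big|\, \sum_{i \in I} W_i > B \Big] \ge \frac{-26 + 59 n}{288 n}. \tag{114} \]

Proof. The events $C_1$, $C$, and $C_n$ form a partition of the event $\sum_{i \in I} W_i > B$, giving

\[
\begin{aligned}
\mathbb{E}\Big[ Z \,\Big|\, \sum_{i \in I} W_i > B \Big]
&= \mathbb{E}[Z \mid C_1]\,\mathbb{P}(C_1) + \mathbb{E}[Z \mid C]\,\mathbb{P}(C) + \mathbb{E}[Z \mid C_n]\,\mathbb{P}(C_n) \\
&\ge \frac{5}{24}\Big( \frac{1}{n} \Big) + \frac{59}{288}\Big( \frac{n - 2}{n} \Big) + \frac{1}{9}\Big( \frac{1}{n} \Big) = \frac{-26 + 59 n}{288 n}.
\end{aligned}
\tag{115}
\]

□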
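As with the subset sum case, the final step of the proof of Theorem 3 is a weighted average of the three conditional results. The sketch below evaluates it exactly, assuming a uniformly distributed critical item index as in the proof; the function name is illustrative.

```python
from fractions import Fraction as F


def consecutive_rollout_gain_bound(n):
    """Lower bound on the expected gain from Lemmas 7, 8, and 9."""
    p_first, p_middle, p_last = F(1, n), F(n - 2, n), F(1, n)
    return F(5, 24) * p_first + F(59, 288) * p_middle + F(1, 9) * p_last
```

The bound increases with $n$ and approaches $59/288 \approx 0.205$ as $n$ grows.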
4 Exhaustive Rollout



In this section we analyze the exhaustive rollout algorithm, shown in Algorithm 3. We first show bounds for
the subset sum problem and then move to the knapsack problem. As in the last section, we only consider 
the performance after the first iteration. The resulting algorithm simply finds the values obtained by first 
inserting each item and then using the greedy algorithm for the remaining items. Of these solutions, the 
algorithm chooses the one with the highest value. We use proof techniques similar to those used for the 
consecutive rollout, where the subset sum problem again has a graphical interpretation. 

Throughout this section, we implicitly assume that $\sum_{i \in I} W_i > B$. For analyzing the exhaustive rollout, we only need to consider cases where items not packed by the greedy algorithm are inserted first. This follows since reordering the packed items will always give the same solution as the greedy algorithm. For a given critical item $K = k$, there are $n - k + 1$ opportunities to reduce the gap of the greedy algorithm (for subset sum) or improve the profit from the greedy algorithm (for knapsack). We will take advantage of the fact that all items other than the critical item have weight distributed according to $U[0, 1]$.
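The first iteration described above can be sketched for the subset sum problem as follows. This is an illustration rather than the paper's Algorithm 3: it assumes the greedy base policy packs items in index order and stops at the first item that does not fit, and it treats an infeasible first insertion as leaving the greedy result unchanged. The function names are hypothetical.

```python
def greedy_gap(weights, capacity):
    """Pack items in index order, stopping at the first item that does not fit."""
    remaining = capacity
    for w in weights:
        if w > remaining:
            break
        remaining -= w
    return remaining


def exhaustive_rollout_gap(weights, capacity):
    """Try each item first, run greedy on the rest, and keep the smallest gap."""
    best = greedy_gap(weights, capacity)
    for i, w in enumerate(weights):
        if w > capacity:
            continue  # inserting item i first is infeasible
        rest = weights[:i] + weights[i + 1:]
        best = min(best, greedy_gap(rest, capacity - w))
    return best
```

With integer weights `[50, 40, 30, 10]` and capacity `75`, the greedy gap is 25, while inserting the last item first lowers the gap to 15. Inserting an already packed item first reproduces the greedy gap, which is the observation used in the paragraph above.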

4.1 Subset Sum Problem 

We begin by considering a greedy solution satisfying $K \ge 2$, meaning that at least one item has been successfully packed. Denote the weight of the last packed item by $A$;

\[ A = W_{K-1}, \quad K > 1. \tag{116} \]

Also let $M$ denote the number of items after the critical item,

\[ M = n - K. \tag{117} \]

We temporarily ignore the critical item and note that there are $M$ items with weights distributed on $U[0, 1]$ that provide opportunities to improve over the greedy solution. Consider the gap $V_i$ obtained by inserting a non-packed item with given weight $w_i$ at the beginning of the sequence. This value as a function of $w_i$, $a$, and $g$ is shown in Figure 7.



Figure 7: Gap $V_i$ as a function of $w_i$, $a$, and $g$ resulting from inserting an item with weight $w_i$ at the beginning of the sequence. The function starts at $g$ and decreases at unit rate, except at $w_i = g$, where the function jumps to value $a$. The probability of the event $V_i > v$ is given by the total length of the bold regions, $(g - v)$ and $(a - v)$, assuming that $v < g$ and $g + a - v < 1$.

We are interested in finding $\mathbb{P}(V_i > v \mid G = g, A = a)$, which we denote by $\mathbb{P}(V_i > v \mid g, a)$. We again note that $W_i$ has a uniform distribution. For the specific case shown in the figure, we have $\mathbb{P}(V_i > v \mid g, a) = (g - v) + (a - v)$ as given by the lengths of the bold regions. This requires that $v < g$ and $v < a$, so the expression becomes $\mathbb{P}(V_i > v \mid g, a) = (g - v)_+ + (a - v)_+$. We must account for the case where $g + a - v > 1$, requiring that we subtract length $(g + a - v - 1)$, so we revise our expression to $\mathbb{P}(V_i > v \mid g, a) = (g - v)_+ + (a - v)_+ - (g + a - v - 1)_+$. Finally, for the case of $g + a < 1$, we must take care of the region where $w_i \in [g + a, 1]$. If the greedy solution only packs one item (the item with weight $a$), this region satisfies $V_i > v$ because inserting the item is infeasible. If there are more packed items, this region contributes at most $(1 - g - a)$ to the probability. To upper bound the probability, both cases are handled by adding the term $(1 - g - a)_+$. This term is omitted for the lower bound. This gives the upper bound

\[
\begin{aligned}
\mathbb{P}(V_i > v \mid g, a)
&\le (g - v)_+ + (a - v)_+ - (g + a - v - 1)_+ + (1 - g - a)_+ \\
&=: \mathbb{P}_u(V_i > v \mid g, a),
\end{aligned}
\tag{118}
\]

and the lower bound

\[ \mathbb{P}(V_i > v \mid g, a) \ge (g - v)_+ + (a - v)_+ - (g + a - v - 1)_+ \tag{119} \]
\[ =: \mathbb{P}_l(V_i > v \mid g, a). \tag{120} \]
For the value obtained by inserting the critical item, we first find the distribution of the critical item weight given the gap. As noted in Section 3, we have

\[ f_{W_K}(w_K) = 2 w_K, \quad 0 \le w_K \le 1, \tag{121} \]
\[ f_{G \mid W_K}(g \mid w_K) = U[0, w_K], \tag{122} \]
\[ f_G(g) = 2 - 2g, \quad 0 \le g \le 1. \tag{123} \]

Using Bayes' theorem, we may deduce that

\[ f_{W_K \mid G}(w_K \mid g) = \frac{f_{G \mid W_K}(g \mid w_K)\, f_{W_K}(w_K)}{f_G(g)} = \frac{\frac{1}{w_K}\, 2 w_K}{2 - 2g} = \frac{1}{1 - g} = U[g, 1]. \tag{124} \]

Let $V'$ denote the gap obtained by inserting the critical item at the beginning of the sequence. We refer to this as the critical item gap. We wish to find $\mathbb{P}(V' > v \mid g, a)$. If we still assume that at least one item with weight $A$ is successfully packed by the greedy algorithm, we may use the same analysis that we used for the non-packed item but restricted to the interval $g \le w_K \le 1$. Taking the expression for $\mathbb{P}(V_i > v \mid g, a)$, removing the $(g - v)_+$ term, and normalizing by $(1 - g)$, we have

\[
\begin{aligned}
\mathbb{P}(V' > v \mid g, a)
&\le \Big( \frac{1}{1 - g} \Big) \big\{ (a - v)_+ - (g + a - v - 1)_+ + (1 - g - a)_+ \big\} \\
&=: \mathbb{P}_u(V' > v \mid g, a).
\end{aligned}
\tag{125}
\]

Similarly, the lower bound is

\[ \mathbb{P}(V' > v \mid g, a) \ge \Big( \frac{1}{1 - g} \Big) \big\{ (a - v)_+ - (g + a - v - 1)_+ \big\} \tag{126} \]
\[ =: \mathbb{P}_l(V' > v \mid g, a). \tag{127} \]

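We are ultimately interested in the minimum of the greedy gap, the critical item gap, and the gaps achieved by the $M$ non-packed items. For now, we fix $M = m$. Let $V_i$ denote the gap obtained by inserting the $i$th non-packed item. The minimum gap $V^*$ is thus defined as

\[ V^* = \min(V_1, \ldots, V_m, V', G). \tag{128} \]

Conditioning on the greedy gap value $g$, the last packed item weight $a$, and the number of items $m$, the values in the above term are independent, so we have

\[
\begin{aligned}
\mathbb{P}(V^* > v \mid a, g, m)
&= \mathbb{P}(V' > v \mid a, g)\,\mathbb{P}(G > v \mid a, g) \prod_{i=1}^m \mathbb{P}(V_i > v \mid g, a) \\
&\ge \mathbb{P}_l(V' > v \mid g, a)\,\mathbb{P}(G > v \mid g, a) \prod_{i=1}^m \mathbb{P}_l(V_i > v \mid g, a) \\
&= \mathbb{P}_l(V' > v \mid g, a)\,\mathbb{P}(G > v \mid g, a)\,[\mathbb{P}_l(V_i > v \mid g, a)]^m \\
&=: \mathbb{P}_l(V^* > v \mid g, a, m).
\end{aligned}
\]

Similarly,

\[
\begin{aligned}
\mathbb{P}(V^* > v \mid a, g, m)
&\le \mathbb{P}_u(V' > v \mid a, g)\,\mathbb{P}(G > v \mid a, g)\,[\mathbb{P}_u(V_i > v \mid a, g)]^m \\
&=: \mathbb{P}_u(V^* > v \mid g, a, m).
\end{aligned}
\]

Marginalizing over $A$ and $G$ gives

\[ \mathbb{P}(V^* > v \mid m) = \int_0^1 \int_0^1 \mathbb{P}(V^* > v \mid g, a, m)\, f_A(a)\, f_G(g)\, da\, dg. \]

Since the integrand is always nonnegative, we have

\[
\begin{aligned}
\mathbb{P}(V^* > v \mid m)
&\le \int_0^1 \int_0^1 \mathbb{P}_u(V^* > v \mid g, a, m)\, f_A(a)\, f_G(g)\, da\, dg =: \mathbb{P}_u(V^* > v \mid m), \\
\mathbb{P}(V^* > v \mid m)
&\ge \int_0^1 \int_0^1 \mathbb{P}_l(V^* > v \mid g, a, m)\, f_A(a)\, f_G(g)\, da\, dg =: \mathbb{P}_l(V^* > v \mid m).
\end{aligned}
\]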

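As shown in Section A.1, evaluation of these integrals gives the upper bound

\[
\mathbb{P}_u(V^* > v \mid m) =
\begin{cases}
\mathbb{P}_u(V^* > v \mid m)_{\le \frac{1}{2}}, & v \le \frac{1}{2}, \\
\mathbb{P}_u(V^* > v \mid m)_{> \frac{1}{2}}, & v > \frac{1}{2},
\end{cases}
\]

where

\[
\begin{aligned}
\mathbb{P}_u(V^* > v \mid m)_{\le \frac{1}{2}} = \frac{1}{3(3 + m)} \big\{ & 2m(1 - 2v)^m + m(1 - v)^m + 9(1 - v)^{3+m} - 12m(1 - 2v)^m v \\
& - 3m(1 - v)^m v + 24m(1 - 2v)^m v^2 + 3m(1 - v)^m v^2 - 16m(1 - 2v)^m v^3 \\
& - m(1 - v)^m v^3 \big\},
\end{aligned}
\]

\[ \mathbb{P}_u(V^* > v \mid m)_{> \frac{1}{2}} = \frac{1}{3}(1 - v)^{3+m} + \frac{2(1 - v)^{3+m}}{3 + m}. \]

The lower bound is

\[ \mathbb{P}_l(V^* > v \mid m) = \frac{1}{3}(1 - v)^{3+m} + \frac{(1 - v)^{3+m}}{3 + m}. \]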



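To determine expected values, we define the following

\[ \mathbb{E}_u[V^* \mid m]_{\le \frac{1}{2}} := \int_0^{\frac{1}{2}} \mathbb{P}_u(V^* > v \mid m)_{\le \frac{1}{2}}\, dv, \tag{138} \]

\[ \mathbb{E}_u[V^* \mid m]_{> \frac{1}{2}} := \int_{\frac{1}{2}}^1 \mathbb{P}_u(V^* > v \mid m)_{> \frac{1}{2}}\, dv, \tag{139} \]

\[ \mathbb{E}_u[V^* \mid m] := \mathbb{E}_u[V^* \mid m]_{\le \frac{1}{2}} + \mathbb{E}_u[V^* \mid m]_{> \frac{1}{2}}, \tag{140} \]

\[ \mathbb{E}_l[V^* \mid m] := \int_0^1 \mathbb{P}_l(V^* > v \mid m)\, dv. \tag{141} \]

The integral evaluations preserve the inequalities, so we have

\[ \mathbb{E}_l[V^* \mid m] \le \mathbb{E}[V^* \mid m] \le \mathbb{E}_u[V^* \mid m]. \tag{142} \]

Evaluating the terms for the upper bound gives

\[
\begin{aligned}
\mathbb{E}_u[V^* \mid m]_{\le \frac{1}{2}} &= \frac{3}{(3 + m)(4 + m)} - \frac{3 \cdot 2^{-4-m}}{(3 + m)(4 + m)} + \frac{2m}{3(3 + m)(4 + m)} - \frac{2^{-4-m}\, m}{3(3 + m)(4 + m)}, \\
\mathbb{E}_u[V^* \mid m]_{> \frac{1}{2}} &= \frac{3 \cdot 2^{-4-m}}{(3 + m)(4 + m)} + \frac{2^{-4-m}\, m}{3(3 + m)(4 + m)},
\end{aligned}
\tag{143}
\]

and thus

\[ \mathbb{E}_u[V^* \mid m, \overline{C}_1] = \frac{9 + 2m}{3(3 + m)(4 + m)}, \tag{144} \]

where $\overline{C}_1$ reminds us that this holds for the event where the first item is not critical, which we have not explicitly denoted until now. For the lower bound, we have

\[ \mathbb{E}_l[V^* \mid m, \overline{C}_1] = \int_0^1 \mathbb{P}_l(V^* > v \mid m)\, dv = \int_0^1 \Big( \frac{1}{3}(1 - v)^{3+m} + \frac{(1 - v)^{3+m}}{3 + m} \Big)\, dv = \frac{6 + m}{3(3 + m)(4 + m)}. \tag{145} \]

If the first item is critical, we may obtain an exact expression for the expected minimum gap. We have

\[ \mathbb{P}(V^* > v \mid C_1, m) = \int_v^1 (1 - v)^m (2 - 2g)\, dg = -(1 - v)^m + 2(1 - v)^{1+m} + (1 - v)^m v^2, \tag{146} \]

and

\[ \mathbb{E}[V^* \mid C_1] = \int_0^1 \mathbb{P}(V^* > v \mid C_1)\, dv = \frac{1}{2 + n}, \tag{147} \]

where we have used that the event $C_1$ indicates $M = n - 1$.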

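We now account for the distribution of $M$, noting that $M = n - K$.

\[
\begin{aligned}
\mathbb{E}[V^*]
&= \sum_{k=1}^n \mathbb{E}[V^* \mid K = k]\,\mathbb{P}(K = k) \\
&= \frac{1}{n}\,\mathbb{E}[V^* \mid C_1] + \frac{1}{n} \sum_{k=2}^n \mathbb{E}[V^* \mid C_k] \\
&\le \frac{1}{n}\,\mathbb{E}[V^* \mid C_1] + \frac{1}{n} \sum_{m=0}^{n-2} \mathbb{E}_u[V^* \mid M = m] \\
&= \frac{1}{n(2 + n)} + \frac{1}{n} \sum_{m=0}^{n-2} \frac{9 + 2m}{3(3 + m)(4 + m)}.
\end{aligned}
\tag{148}
\]

Similarly,

\[
\begin{aligned}
\mathbb{E}[V^*]
&\ge \frac{1}{n}\,\mathbb{E}[V^* \mid C_1] + \frac{1}{n} \sum_{m=0}^{n-2} \mathbb{E}_l[V^* \mid M = m] \\
&= \frac{1}{n(2 + n)} + \frac{1}{n} \sum_{m=0}^{n-2} \frac{6 + m}{3(3 + m)(4 + m)}.
\end{aligned}
\tag{149}
\]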



Throughout all of the analysis in this section, we have implicitly assumed that $\sum_{i \in I} W_i > B$. We have thus shown the following.

Theorem 4. For the subset sum problem, the gap $V^*$ after running a single iteration of the exhaustive rollout algorithm satisfies

\[ \frac{1}{n(2 + n)} + \frac{1}{n} \sum_{m=0}^{n-2} \frac{6 + m}{3(3 + m)(4 + m)} \le \mathbb{E}\Big[ V^* \,\Big|\, \sum_{i \in I} W_i > B \Big] \le \frac{1}{n(2 + n)} + \frac{1}{n} \sum_{m=0}^{n-2} \frac{9 + 2m}{3(3 + m)(4 + m)}. \tag{150} \]
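The bounds of Theorem 4 are finite sums and are easy to tabulate. The following sketch evaluates both sides of (150) with exact fractions; the function name is an illustrative assumption, and the summands are the expressions $(6 + m)/(3(3 + m)(4 + m))$ and $(9 + 2m)/(3(3 + m)(4 + m))$ from (145) and (144).

```python
from fractions import Fraction as F


def exhaustive_rollout_gap_bounds(n):
    """Return (lower, upper) bounds on E[V*] from Theorem 4 for n items."""
    first_critical = F(1, n * (2 + n))
    lower_sum = sum(F(6 + m, 3 * (3 + m) * (4 + m)) for m in range(n - 1))
    upper_sum = sum(F(9 + 2 * m, 3 * (3 + m) * (4 + m)) for m in range(n - 1))
    return first_critical + lower_sum / n, first_critical + upper_sum / n
```

For instance, `exhaustive_rollout_gap_bounds(2)` gives $(5/24, 1/4)$, and both bounds shrink as $n$ increases.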



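The sum terms may be bounded with integral approximations. For the upper bound, the argument of the sum is convex in $m$, so the midpoint rule provides an upper bound.

\[ \sum_{m=0}^{n-2} \frac{9 + 2m}{3(3 + m)(4 + m)} \le \int_{-\frac{1}{2}}^{n - \frac{3}{2}} \frac{9 + 2m}{3(3 + m)(4 + m)}\, dm = \log\left[ \Big( \frac{3 + 2n}{5} \Big) \Big( \frac{7}{5 + 2n} \Big)^{1/3} \right]. \tag{151} \]

The lower bound is bounded using the left rule.

\[ \sum_{m=0}^{n-2} \frac{6 + m}{3(3 + m)(4 + m)} \ge \int_0^{n-1} \frac{6 + m}{3(3 + m)(4 + m)}\, dm = \log\left[ \Big( \frac{2 + n}{3} \Big) \Big( \frac{4}{3 + n} \Big)^{2/3} \right]. \tag{152} \]

This gives the corresponding result.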

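Corollary 1. For the subset sum problem, the gap $V^*$ after running a single iteration of the exhaustive rollout algorithm satisfies

\[ \mathbb{E}\Big[ V^* \,\Big|\, \sum_{i \in I} W_i > B \Big] \le \frac{1}{n(2 + n)} + \frac{1}{n} \log\left[ \Big( \frac{3 + 2n}{5} \Big) \Big( \frac{7}{5 + 2n} \Big)^{1/3} \right], \tag{153} \]

\[ \mathbb{E}\Big[ V^* \,\Big|\, \sum_{i \in I} W_i > B \Big] \ge \frac{1}{n(2 + n)} + \frac{1}{n} \log\left[ \Big( \frac{2 + n}{3} \Big) \Big( \frac{4}{3 + n} \Big)^{2/3} \right]. \tag{154} \]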


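The asymptotic result then follows.

Theorem 5. For the subset sum problem, the gap $V^*$ after running a single iteration of the exhaustive rollout algorithm satisfies

\[ \lim_{n \to \infty} \mathbb{E}\Big[ V^* \,\Big|\, \sum_{i \in I} W_i > B \Big] = 0, \tag{155} \]

\[ \mathbb{E}\Big[ V^* \,\Big|\, \sum_{i \in I} W_i > B \Big] = \Theta\Big( \frac{\log n}{n} \Big). \tag{156} \]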
4.2 Knapsack Problem

As with the subset sum problem, we start by assuming that at least one item is successfully packed. Let the weight of the last successfully packed item be denoted by $A$ and its profit by $Q$. We consider three possibilities for the result of inserting a non-packed item $i$ at the beginning of the sequence. If $w_i \le g$, the item is feasible and it does not make any other items infeasible; this event is denoted by $A_+$. The event $A_0$ occurs when $g < w_i \le g + a$, meaning that the last packed item becomes infeasible with the insertion of item $i$. Finally, $A_-$ indicates that insertion of item $i$ is either infeasible, or it makes two or more packed items infeasible. This occurs when $w_i > g + a$, which obviously requires $g + a < 1$. Conditioned on $g$ and $a$, the probabilities of these events are

\[
\begin{aligned}
\mathbb{P}(A_+ \mid g, a) &= g, \\
\mathbb{P}(A_0 \mid g, a) &= a - (g + a - 1)_+, \\
\mathbb{P}(A_- \mid g, a) &= (1 - g - a)_+.
\end{aligned}
\tag{157}
\]
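The three insertion events partition the unit interval of possible weights, so their probabilities should sum to one for any $g$ and $a$ in $[0, 1]$. The short sketch below encodes (157) and can be used to confirm this; the function name is an illustrative assumption.

```python
def insertion_event_probs(g, a):
    """Probabilities of A_+, A_0, A_- for an inserted item with U[0, 1] weight."""
    def pos(x):
        return max(x, 0.0)

    p_plus = g                         # item fits in the existing gap
    p_zero = a - pos(g + a - 1.0)      # item displaces only the last packed item
    p_minus = pos(1.0 - g - a)         # infeasible, or displaces two or more items
    return p_plus, p_zero, p_minus
```

For example, `insertion_event_probs(0.5, 0.25)` returns `(0.5, 0.25, 0.25)`, while for $g + a \ge 1$ the last probability is zero.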

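Beginning with the observation that conditioned on these events, the gain $Z_i$ is independent of $g$ and $a$, we have

\[
\begin{aligned}
\mathbb{P}(Z_i \le z \mid g, a, q)
&= \mathbb{P}(Z_i \le z \mid A_+, q)\,\mathbb{P}(A_+ \mid g, a) + \mathbb{P}(Z_i \le z \mid A_0, q)\,\mathbb{P}(A_0 \mid g, a) + \mathbb{P}(Z_i \le z \mid A_-, q)\,\mathbb{P}(A_- \mid g, a) \\
&\le \mathbb{P}(P_i \le z)\,\mathbb{P}(A_+ \mid g, a) + \mathbb{P}(P_i - Q \le z \mid q)\,\mathbb{P}(A_0 \mid g, a) + \mathbb{P}(A_- \mid g, a) \\
&= z\,\mathbb{P}(A_+ \mid g, a) + \min(z + q, 1)\,\mathbb{P}(A_0 \mid g, a) + \mathbb{P}(A_- \mid g, a) \\
&=: \mathbb{P}_u(Z_i \le z \mid g, a, q).
\end{aligned}
\tag{158}
\]

We may perform a similar analysis for the case of inserting the critical item at the beginning of the sequence. Let $A'_0$ be the event that inserting the critical item at the beginning of the sequence makes only the last packed item infeasible, and let $A'_-$ indicate the event that it is not feasible to insert this item, or that doing so makes two or more packed items infeasible. Using knowledge of $f_{W_K \mid G}(\cdot \mid \cdot)$ gives

\[
\begin{aligned}
\mathbb{P}(A'_0 \mid g, a) &= \Big( \frac{1}{1 - g} \Big) \big\{ a - (g + a - 1)_+ \big\}, \\
\mathbb{P}(A'_- \mid g, a) &= \Big( \frac{1}{1 - g} \Big) \big\{ (1 - g - a)_+ \big\}.
\end{aligned}
\tag{159}
\]

Furthermore, let $Z'$ denote the gain obtained by inserting the critical item.

\[
\begin{aligned}
\mathbb{P}(Z' \le z \mid g, a, q)
&= \mathbb{P}(Z' \le z \mid A'_0, q)\,\mathbb{P}(A'_0 \mid g, a) + \mathbb{P}(Z' \le z \mid A'_-, q)\,\mathbb{P}(A'_- \mid g, a) \\
&\le \mathbb{P}(P_K - Q \le z \mid q)\,\mathbb{P}(A'_0 \mid g, a) + \mathbb{P}(A'_- \mid g, a) \\
&= \min(z + q, 1)\,\mathbb{P}(A'_0 \mid g, a) + \mathbb{P}(A'_- \mid g, a) \\
&=: \mathbb{P}_u(Z' \le z \mid g, a, q).
\end{aligned}
\tag{160}
\]

Let $Z^*$ denote the maximum gain achieved over $m$ non-packed items and the critical item. We have

\[
\begin{aligned}
\mathbb{P}(Z^* \le z \mid g, a, q, m)
&= \mathbb{P}(Z' \le z \mid g, a, q) \prod_{i=1}^m \mathbb{P}(Z_i \le z \mid g, a, q) \\
&\le \mathbb{P}_u(Z' \le z \mid g, a, q) \prod_{i=1}^m \mathbb{P}_u(Z_i \le z \mid g, a, q) \\
&= \big[ \min(z + q, 1)\,\mathbb{P}(A'_0 \mid g, a) + \mathbb{P}(A'_- \mid g, a) \big] \big[ z\,\mathbb{P}(A_+ \mid g, a) + \min(z + q, 1)\,\mathbb{P}(A_0 \mid g, a) + \mathbb{P}(A_- \mid g, a) \big]^m \\
&=: \mathbb{P}_u(Z^* \le z \mid g, a, q, m).
\end{aligned}
\tag{161}
\]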
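We use the following shorthand notation

\[
\begin{aligned}
P_+ &= \mathbb{P}(A_+ \mid g, a), \\
P_0 &= \mathbb{P}(A_0 \mid g, a), \\
P_- &= \mathbb{P}(A_- \mid g, a), \\
P'_0 &= \mathbb{P}(A'_0 \mid g, a), \\
P'_- &= \mathbb{P}(A'_- \mid g, a).
\end{aligned}
\tag{162}
\]

Integrating over $Q$, which has uniform density, gives

\[
\begin{aligned}
\mathbb{P}(Z^* \le z \mid g, a, m)
&= \int_0^1 \mathbb{P}(Z^* \le z \mid g, a, q, m)\, dq \\
&\le \int_0^1 \mathbb{P}_u(Z^* \le z \mid g, a, q, m)\, dq
\end{aligned}
\tag{163}
\]

\[
\begin{aligned}
&= \int_0^{1 - z} \big[ (z + q) P'_0 + P'_- \big] \big[ z P_+ + (z + q) P_0 + P_- \big]^m\, dq + \int_{1 - z}^1 \big[ P'_0 + P'_- \big] \big[ z P_+ + P_0 + P_- \big]^m\, dq \\
&= (P'_0 + P'_-)(P_0 + P_- + P_+ z)^m z \\
&\quad + \frac{1}{(m + 1)(m + 2) P_0^2} \Big\{ (P_0 + P_- + P_+ z)^{m+1} \big[ P_0 P'_0 (m + 1) + P_0 P'_- (m + 2) - P'_0 P_- - P'_0 P_+ z \big] \\
&\qquad - \big[ P_- + (P_0 + P_+) z \big]^{m+1} \big[ (2 + m) P_0 P'_- - P_- P'_0 + P'_0 (P_0 + m P_0 - P_+) z \big] \Big\} \\
&=: \mathbb{P}_u(Z^* \le z \mid g, a, m).
\end{aligned}
\tag{164}
\]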

At this point it is useful to evaluate separately the cases where g + a < 1 and g + a > 1. Beginning with the 
case g + a < 1, which we denote by ga, we have 

(1 — g + gz) m+1 (l — g + m — gm — gz) 

r u {Z <z\g,a,m,ga) = z 1 - g + gz) H ■ — r— — — 

— (1 — g)a(m + l)(m + 2) 

[1 — g + .92 + a(l — z)] m+1 [l — g + m— gm — gz + a(— 1 — m + z + mz)] 

(l-5)a(m + l)(m + 2) 

We now wish to compute 

P(Z*<z\m,ga) =11 " P(Z* < z\g, a,m)f A (a)f G (g)dadg 
Jo Jo 

< / / P„(Z* < a,m,ga)f A (a)f G (g)dadg 
Jo Jo 



(165) 



< z|m,#a). (166) 



The evaluation of this integral is given in Section A.2, which shows

m+l 

V U (Z* < z\m,ga) = pi(m,z) + }^ p 2 j(m,z) + p 3 (m, z) + p4,(m, z), (167) 
where

2z (2 + to 2 (-1 + z) 2 + m(-l + z)(-3 + 5z) - 2z (3 - 3z + z 2+m )) 



pi(m,z) 
p 2j (m,z) 

p 3 (m,z) 
Pi{m, z) 



(1 + m)(2 + m)(3 + m)(-l + z) 3 
2z 3+m (j + (2 + m) (-2 + z) - jz) 
j(-3 + j - to) (-2 + j - to)(1 + to) (2 + to)(-1 + z) 2 ' 
2z j (-j(l + m)(-l + z) + (2 + to)(-1 + m(-l + z) + 2z)) 
j(-3 + j - to)(-2 + j - m)(l + to)(2 + m)(-l + z) 2 ' 
2iJ(m + 1) (-1 + m(-l + z) + 2z + (-2 + z)z 3+m ) 
(1 + to)(2 + m)(3 + m)(-l + z) 2 ' 
2 2z 2+m 2z 3+m 



+ 



(2 + to) 2 (3 + to)(-1 + z) (2 + to) 2 (2 + to) 2 (3 + to)(-1 + z)' 



Since we are ultimately interested in the expected value of Z*, we evaluate

E u [Z*\m,ga}= [ P U (Z* < z\m,ga)dz. 
Jo 

Using £j(m) = J Q Pj(m, z)dz, we have 

m+l 

E„[Z*|TO,ga] =6M+ Xl^( m ) + ^ 3 ( m ) + ^( m )' 



where 

£i(m) 

e 3 (m) 

£ 4 (m) 



2i2"(m + 1)(3 + to - £T(m + 3) (2 + m)) 
(to+ 1)(to + 2)(to + 3) _ ' 
2 (-(-3 + - to) (2 + to) + + (2 + to) 2 ) - (j + (2 + to) 2 ) g(m + 3)) 
j(-3 + j - to)(-2 + j - to)(1 + to)(2 + to) 

2(#(to + 3) - 1) 
(2 + to) 2 (3 + to)' 
(2 + m)(17 + 5m) - 2(3 + to)(4 + m)H(m + 2) 



(to+ 1)(to + 2)(to + 3) 
This completes the case for g + a < 1. Now considering g + a > 1, denoted by ga, we have 

P u (Z*<z|g,a,m,ga) = z(l - g + gz)" - (1 ~ 2g + (1 ~ 

(1 — g) 2 (l + to)(2 + to) 

((l-g)(l + m)-gz)(l-g + gz) 1 +" 
(1 - 5 ) 2 (1 + to)(2 + to) 

Continuing as we did with the first case, 

¥(Z* < z\m,ga) = [ [ P(Z* < z\g,a,m)f A (a)f G (g)dadg 

JO Jl-g 

r-1 /■! 



< / / V u (Z*<z\g, a, m, ga) f A (a) f G (g)dadg 

Jo Jl-g 

= / gV u (Z* < z\g,a 1 m,ga)f G (g)dg 
Jo 



\(Z* <z\m,ga), 
where we have used the fact that the expression $\mathbb{P}_u(Z^* \le z \mid g, a, m, ga)$ is not a function of $a$. Evaluation of this integral is given in Section A.3; the expression is

2z(l + m-3z-mz + z 2+m (3 + m- (1 +771)2)) -2z ^ z ™+i-j 

Vu(Z <z\m,ga) = (1 + m)(2 + to)(3 + m)(-l + z) 3 + (m+l)(m + 2) j 

2z 2(1 + m + z) (6 + 2m)z m + 2 

+ (m + l)(m + 2) 2 (1 - z) + (m + l)(m + 2) 2 (to + 3)(1 - z) 2 ~ (m + l)(m + 2) 
z m+2 2ff(m+l)z m + 2 2(l + TO + 2z)z m + 2 



(174) 



m + 1 (m + l)(m + 2) (to + l)(m + 2) 2 (1 - z) 
2(1 + to + z)z m + 3 
~~ (to + 1)(to + 2) 2 (to + 3)(1 - z) 2 ' 

We again calculate the following term for the expected value 

E u [Z*\m,ga] = [ P U (Z* < z\m)dz 
Jo 

20 + IOto + to 2 - 2(3 + m)if(l + to) 2 

(2 + to)(3 + to) 2 h ^-J j(-3 + j- to)(1 + to)(2 + m) ' ^ ^ 

Bringing together both cases $g + a < 1$ and $g + a > 1$, and noting that we have assumed the event $\overline{C}_1$ (meaning that the first item is not critical), we have

$$\begin{aligned}
\mathbb{E}[Z^* \mid m,\overline{C}_1] &= \int_0^1 \left(1-\mathbb{P}(Z^* \le z \mid m)\right)dz \\
&\ge 1-\int_0^1 \mathbb{P}_u(Z^* \le z \mid m)\,dz \\
&= 1-\mathbb{E}_u[Z^* \mid m,\underline{ga}]-\mathbb{E}_u[Z^* \mid m,\overline{ga}] \\
&= 1+\frac{1}{(m+1)(m+2)^3(m+3)^2}\Big\{(186+472m+448m^2+203m^3+45m^4+4m^5) \\
&\qquad +(-244-454m-334m^2-124m^3-24m^4-2m^5)\,H(m+1) \\
&\qquad +(-48-88m-60m^2-18m^3-2m^4)\,[H(m+2)]^2\Big\} \\
&\quad +\sum_{j=1}^{m+1}\frac{2\left(-4+j-4m+jm-m^2-(j+(2+m)^2)H(j)+(j+(2+m)^2)H(3+m)\right)}{j(-3+j-m)(-2+j-m)(1+m)(2+m)} \\
&=: \mathbb{E}_l[Z^* \mid m,\overline{C}_1],
\end{aligned} \tag{176}$$

where $\underline{ga}$ denotes the case $g + a < 1$.

If the first item is critical,

$$\mathbb{P}(Z^* \le z \mid g,a,m) = \mathbb{P}(Z^* \le z \mid g,m) = (1-g+gz)^m. \tag{177}$$

Marginalizing over $G$ gives

$$\mathbb{P}(Z^* \le z \mid m) = \int_0^1 \mathbb{P}(Z^* \le z \mid g,m) f_G(g)\,dg = \int_0^1 (1-g+gz)^m(2-2g)\,dg = \frac{2\left(1+m-2z-mz+z^{2+m}\right)}{(1+m)(2+m)(-1+z)^2}, \tag{178}$$

and

$$\mathbb{E}(Z^* \mid m,C_1) = 1-\int_0^1 \mathbb{P}(Z^* \le z \mid m)\,dz = 1+\frac{2}{2+m}-\frac{2H(m+1)}{m+1}. \tag{179}$$

Since the event $C_1$ indicates $m = n - 1$,

$$\mathbb{E}(Z^* \mid C_1) = 1+\frac{2}{n+1}-\frac{2H(n)}{n}. \tag{180}$$
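The closed forms for the critical-item case can be checked with exact rational arithmetic. The sketch below expands $(1-g+gz)^m$ with the binomial theorem, integrates term by term, and compares the result with (178) and (179); the function names are illustrative and not part of the paper.

```python
from fractions import Fraction
from math import comb


def harmonic(n):
    """Exact harmonic number H(n) = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def cdf_critical_exact(m, z):
    """Integral of (1 - g + g z)^m (2 - 2g) over g in [0, 1].

    Uses 1 - g + g z = 1 - c g with c = 1 - z, and the fact that the
    integral of g^k (2 - 2g) over [0, 1] equals 2 / ((k + 1)(k + 2)).
    """
    c = 1 - z
    return sum(comb(m, k) * (-c) ** k * Fraction(2, (k + 1) * (k + 2))
               for k in range(m + 1))


def cdf_critical_closed(m, z):
    """Closed form (178); z must differ from 1."""
    num = 2 * (1 + m - 2 * z - m * z + z ** (m + 2))
    return num / ((1 + m) * (2 + m) * (z - 1) ** 2)


def gain_critical_exact(m):
    """1 minus the integral of the CDF over z in [0, 1], term by term."""
    integral = sum(comb(m, k) * (-1) ** k * Fraction(2, (k + 1) ** 2 * (k + 2))
                   for k in range(m + 1))
    return 1 - integral


def gain_critical_closed(m):
    """Closed form (179): 1 + 2/(2+m) - 2 H(m+1)/(m+1)."""
    return 1 + Fraction(2, m + 2) - 2 * harmonic(m + 1) / (m + 1)
```

Setting `m = n - 1` in `gain_critical_closed` gives the expression in (180).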

Finally, accounting for the distribution of m gives the theorem. 

Theorem 6. For the knapsack problem, the gain $Z^*$ after running a single iteration of the exhaustive rollout algorithm satisfies

$$\begin{aligned}
\mathbb{E}\left[Z^* \,\middle|\, \sum_{i=1}^{n} W_i > B\right] \ge{}& 1+\frac{2}{n(n+1)}-\frac{2H(n)}{n^2} \\
&+\frac{1}{n}\sum_{m=0}^{n-2}\Bigg\{\frac{1}{(m+1)(m+2)^3(m+3)^2}\Big[(186+472m+448m^2+203m^3+45m^4+4m^5) \\
&\qquad -(244+454m+334m^2+124m^3+24m^4+2m^5)\,H(m+1) \\
&\qquad -(48+88m+60m^2+18m^3+2m^4)\,[H(m+2)]^2\Big] \\
&\quad +\sum_{j=1}^{m+1}\frac{2\left(-4+j-4m+jm-m^2-(j+(2+m)^2)H(j)+(j+(2+m)^2)H(3+m)\right)}{j(-3+j-m)(-2+j-m)(1+m)(2+m)}\Bigg\}.
\end{aligned} \tag{181}$$



The nested summation term may be omitted without significant loss in the performance bound. This is 
accomplished by showing that the argument of the sum is always positive. 



Lemma 10. For all $m \ge 0$ and $1 \le j \le m + 1$,

$$\frac{-4+j-4m+jm-m^2-(j+(2+m)^2)H(j)+(j+(2+m)^2)H(3+m)}{j(-3+j-m)(-2+j-m)(1+m)(2+m)} > 0. \tag{182}$$

Proof. The denominator is always positive, so we focus on the numerator. The numerator is $N_2(j,m) - N_1(j,m)$, where

$$N_1(j,m) = (4-j)(m+1)+m^2, \tag{183}$$

$$N_2(j,m) = (j+(2+m)^2)\sum_{i=j+1}^{m+3}\frac{1}{i}. \tag{184}$$

Our goal is to show that $N_2(j,m) > N_1(j,m)$ always holds. The difference equation for $N_2(j,m)$ with respect to $j$ satisfies

$$\begin{aligned}
\Delta(N_2(j,m)) &\triangleq N_2(j+1,m)-N_2(j,m) \\
&= \sum_{i=j+2}^{m+3}\frac{1}{i}+(j+(2+m)^2)\sum_{i=j+2}^{m+3}\frac{1}{i}-(j+(2+m)^2)\sum_{i=j+1}^{m+3}\frac{1}{i} \\
&= \sum_{i=j+2}^{m+3}\frac{1}{i}-\frac{j+(2+m)^2}{j+1} \\
&\le \frac{m-j+2}{j+2}-\frac{j+(2+m)^2}{j+1} \\
&\le \frac{m-j+2}{j+1}-\frac{j+(2+m)^2}{j+1} \\
&= \frac{-2-3m-m^2-2j}{j+1} \\
&\le \frac{-4-3m-m^2}{m+2}.
\end{aligned} \tag{185}$$

For the other term, we have

$$\Delta(N_1(j,m)) = -(m+1). \tag{186}$$

Both $N_1(j,m)$ and $N_2(j,m)$ are decreasing in $j$, and $N_2(j,m)$ decreases at a greater rate. We approximate $N_2(j,m)$ with the following:

$$H(m+3)-H(j) = \sum_{i=j+1}^{m+3}\frac{1}{i} \ge \int_{j+1}^{m+4}\frac{1}{x}\,dx = \log\left(\frac{m+4}{j+1}\right). \tag{187}$$

Looking at $j = 1$,

$$N_1(1,m) = 3+3m+m^2, \tag{188}$$

$$N_2(1,m) \ge (5+4m+m^2)\log\left(\frac{m+4}{2}\right), \tag{189}$$

guaranteeing $N_2(1,m) > N_1(1,m)$. With consideration of starting points and slopes for the two numerator terms, ensuring that $N_2(m+1,m) > N_1(m+1,m)$ is sufficient for the lemma. We have

$$N_1(m+1,m) = 3+2m, \tag{190}$$

$$\begin{aligned}
N_2(m+1,m) &= (m+1+(2+m)^2)\left(\frac{1}{m+2}+\frac{1}{m+3}\right) \\
&\ge (5+5m+m^2)\,\frac{2}{m+3} = \frac{10+10m+2m^2}{m+3} \\
&> 3+2m.
\end{aligned} \tag{191}$$

□
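As a sanity check on Lemma 10, the inequality $N_2(j,m) > N_1(j,m)$ can be tested by brute force over a finite range with exact rational arithmetic. This is only an illustrative sketch (the helper names are ours), not a substitute for the proof.

```python
from fractions import Fraction


def n1(j, m):
    """N_1(j, m) = (4 - j)(m + 1) + m^2, as in (183)."""
    return (4 - j) * (m + 1) + m * m


def n2(j, m):
    """N_2(j, m) = (j + (2 + m)^2) * sum_{i=j+1}^{m+3} 1/i, as in (184)."""
    tail = sum(Fraction(1, i) for i in range(j + 1, m + 4))
    return (j + (2 + m) ** 2) * tail


def lemma10_holds(max_m):
    """Check N_2 > N_1 for all 0 <= m <= max_m and 1 <= j <= m + 1."""
    return all(n2(j, m) > n1(j, m)
               for m in range(max_m + 1)
               for j in range(1, m + 2))
```

For example, `lemma10_holds(40)` scans every admissible pair $(j, m)$ with $m \le 40$.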



TO + 3 
This establishes the following.

Corollary 2. For the knapsack problem, the gain $Z^*$ after running a single iteration of the exhaustive rollout algorithm satisfies

$$\begin{aligned}
\mathbb{E}\left[Z^* \,\middle|\, \sum_{i=1}^{n} W_i > B\right] \ge{}& 1+\frac{2}{n(n+1)}-\frac{2H(n)}{n^2} \\
&+\frac{1}{n}\sum_{m=0}^{n-2}\Bigg\{\frac{1}{(m+1)(m+2)^3(m+3)^2}\Big[(186+472m+448m^2+203m^3+45m^4+4m^5) \\
&\qquad -(244+454m+334m^2+124m^3+24m^4+2m^5)\,H(m+1) \\
&\qquad -(48+88m+60m^2+18m^3+2m^4)\,[H(m+2)]^2\Big]\Bigg\}.
\end{aligned} \tag{192}$$



We are now ready to study the asymptotic behavior of the bound from Theorem 6. We will show that $\lim_{n\to\infty}\mathbb{E}[Z^* \mid \cdot] = 1$, so we are interested in bounding the rate at which $1 - \mathbb{E}[Z^* \mid \cdot]$ approaches 0. Accordingly, we are only concerned with the negative terms in (181). The magnitudes of these terms are

$$T_1(n) = \frac{2H(n)}{n^2}, \tag{193}$$

$$T_2(n) = \frac{1}{n}\sum_{m=0}^{n-2}\frac{(244+454m+334m^2+124m^3+24m^4+2m^5)\,H(m+1)}{(m+1)(m+2)^3(m+3)^2}, \tag{194}$$

$$T_3(n) = \frac{1}{n}\sum_{m=0}^{n-2}\frac{(48+88m+60m^2+18m^3+2m^4)\,[H(m+2)]^2}{(m+1)(m+2)^3(m+3)^2}. \tag{195}$$

The summands of the second and third terms are decreasing in $m$, so the sums are bounded by their respective integrals. Using a logarithmic bound on the harmonic numbers, we have

$$T_1(n) = O\left(\frac{\log n}{n^2}\right), \tag{196}$$

$$T_2(n) = \frac{1}{n}\sum_{m} O\left(\frac{\log m}{m}\right) = \frac{1}{n}\int O\left(\frac{\log m}{m}\right)dm = O\left(\frac{\log^2 n}{n}\right),$$

$$T_3(n) = \frac{1}{n}\sum_{m} O\left(\frac{\log^2 m}{m^2}\right) = \frac{1}{n}\int O\left(\frac{\log^2 m}{m^2}\right)dm = O\left(\frac{1}{n}\right).$$

This also shows that $\lim_{n\to\infty}\mathbb{E}[Z^* \mid \cdot] = 1$, since the gain has a natural upper bound of unit value. The final theorem then follows.
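The growth rates of the three negative terms can be illustrated numerically. The sketch below evaluates $T_1$, $T_2$, and $T_3$ in floating point, using the coefficient polynomials from (181); the function names are ours, and the computation only illustrates the stated orders rather than proving them.

```python
import math


def negative_terms(n):
    """Return (T1, T2, T3) for a given n >= 2, in floating point."""
    # Prefix harmonic numbers H[0..n]; the sums need H(m+1) and H(m+2) for m <= n-2.
    H = [0.0] * (n + 1)
    for k in range(1, n + 1):
        H[k] = H[k - 1] + 1.0 / k

    t1 = 2.0 * H[n] / n ** 2
    t2 = 0.0
    t3 = 0.0
    for m in range(n - 1):  # m = 0, ..., n - 2
        denom = (m + 1) * (m + 2) ** 3 * (m + 3) ** 2
        p5 = 244 + 454 * m + 334 * m**2 + 124 * m**3 + 24 * m**4 + 2 * m**5
        p4 = 48 + 88 * m + 60 * m**2 + 18 * m**3 + 2 * m**4
        t2 += p5 * H[m + 1] / denom
        t3 += p4 * H[m + 2] ** 2 / denom
    return t1, t2 / n, t3 / n


def scaled_terms(n):
    """Scale T2 by n / log^2(n) and T3 by n; both should stay bounded as n grows."""
    t1, t2, t3 = negative_terms(n)
    return t2 * n / math.log(n) ** 2, t3 * n
```

If the orders above are right, the first value returned by `scaled_terms` should remain bounded as `n` grows and the second should approach a constant.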



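Theorem 7. For the knapsack problem, the gain $Z^*$ after running a single iteration of the exhaustive rollout algorithm satisfies

$$\lim_{n\to\infty}\mathbb{E}\left[Z^* \,\middle|\, \sum_{i=1}^{n} W_i > B\right] = 1, \tag{199}$$

$$1-\mathbb{E}\left[Z^* \,\middle|\, \sum_{i=1}^{n} W_i > B\right] = O\left(\frac{\log^2 n}{n}\right). \tag{200}$$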



In comparison to the asymptotic result on the subset sum problem (Theorem pH), this result has an extra 
log n in the numerator. This can likely be explained as follows. For the subset sum problem, the gap 
approaches zero when an item is found that is near the size of the greedy gap. In the knapsack problem, the 
gain approaches one when an item is found that has a profit near unit value and a weight smaller than the 
greedy gap. Since this necessary item must satisfy two criteria, it is not found as quickly as the corresponding 
item for the subset sum problem. While we only determined the asymptotic upper bound for the knapsack 
problem, it is conjectured that the lower bound is of the same order. 



5 Conclusion 

We have shown strong performance bounds for both the consecutive rollout and exhaustive rollout techniques
on the subset sum problem and knapsack problem. For the subset sum problem with $n \ge 3$, the consecutive
rollout algorithm reduces the expected gap from $\frac{1}{3} \approx 0.333$ to at most $\frac{7}{30} \approx 0.233$, and the exhaustive rollout
algorithm reduces the expected gap to at most approximately 0.211. Similarly for the knapsack problem, the gain of
the consecutive rollout algorithm is at least approximately 0.175 and the gain of the exhaustive rollout algorithm is
at least approximately 0.226 (again with $n \ge 3$). These results hold after only a single iteration and provide bounds
for additional iterations. Simulation results indicate that these bounds are very close to the
realized performance of a single iteration. We have shown that asymptotically (with respect to the total
number of items), the expected performance of the exhaustive rollout algorithm converges to a constant at
rate $\Theta\left(\frac{\log n}{n}\right)$ for the subset sum problem and rate $O\left(\frac{\log^2 n}{n}\right)$ for the knapsack problem.

Finding improved/tight bounds is certainly an opportunity for future work, though it is not obvious how
to improve the approximation. For both the consecutive and the exhaustive rollout techniques, considering 
the contribution of each additional packed item to the final minimum gap adds a dimension to the space of 
instances which must be integrated over, and these integrals are already complex. In general, finding the 
expected value of the minimum of many random variables can be difficult. 

Another interesting direction is to consider a second iteration of the rollout algorithm. The worst-case 
analysis of rollout algorithms for the knapsack problem in [T] shows that running one iteration for a particular 
base policy results in a notable improvement, but it is not possible to guarantee additional improvement with 
more iterations. This is most likely not a limitation in the average-case scenario, but a few difficulties arise 
here. First, such an analysis likely requires finding the full distribution for the gap after the first iteration, 
rather than just the expected value. More importantly, the useful property that non-critical item weights 
remain i.i.d. on U[0, 1] (specifically Lemma 2) does not seem to hold for the remaining items after the first
iteration of the rollout has occurred. 

A somewhat related topic is to still consider only the first iteration of the rollout algorithm, but with 
a larger lookahead length (e.g. trying all pairs of items for the exhaustive rollout, rather than just each 
item individually). From the worst-case perspective, it seems easier to analyze this case than the effect of 
additional iterations. 

Finally, it is desirable to have theoretical results for more complex problems. Studying problems with a 
multidimensional state space is appealing since these are the types of problems where rollout techniques are 
often used and perform well in practice. In this direction, it would be useful to consider problems such as
the bin packing problem, the multiple knapsack problem, and the multidimensional knapsack problem. 
A Appendix



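The following lemma is used in integral evaluations described in this section.

Lemma 11. For constant values $\kappa_1$, $\kappa_2$ and nonnegative integer $\theta$,

$$\int \frac{(\kappa_1+\kappa_2 x)^{\theta}}{x}\,dx = \kappa_1^{\theta}\log(x)+\sum_{j=1}^{\theta}\frac{\kappa_1^{\theta-j}(\kappa_1+\kappa_2 x)^j}{j}. \tag{201}$$

Proof. We begin by noting that

$$\int \frac{(\kappa_1+\kappa_2 x)^{\theta}}{x}\,dx = \kappa_2\int (\kappa_1+\kappa_2 x)^{\theta-1}\,dx+\kappa_1\int \frac{(\kappa_1+\kappa_2 x)^{\theta-1}}{x}\,dx. \tag{202}$$

The property clearly holds for $\theta = 0$. Assuming that it holds for $\theta = t$, we have for $\theta = t + 1$,

$$\begin{aligned}
\int \frac{(\kappa_1+\kappa_2 x)^{t+1}}{x}\,dx &= \frac{(\kappa_1+\kappa_2 x)^{t+1}}{t+1}+\kappa_1\left(\kappa_1^{t}\log(x)+\sum_{j=1}^{t}\frac{\kappa_1^{t-j}(\kappa_1+\kappa_2 x)^j}{j}\right) \\
&= \kappa_1^{t+1}\log(x)+\sum_{j=1}^{t+1}\frac{\kappa_1^{t+1-j}(\kappa_1+\kappa_2 x)^j}{j}.
\end{aligned} \tag{203}$$

The property then holds for all $\theta$ by induction.

□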
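The identity in Lemma 11 can be checked by differentiating its right-hand side and comparing with the integrand. The following sketch does this with exact rationals at a few sample points; the function names and sample values are our own choices.

```python
from fractions import Fraction


def integrand(k1, k2, theta, x):
    """Left-hand integrand (k1 + k2 x)^theta / x."""
    return (k1 + k2 * x) ** theta / x


def rhs_derivative(k1, k2, theta, x):
    """Derivative in x of k1^theta log(x) + sum_{j=1}^{theta} k1^(theta-j) (k1 + k2 x)^j / j."""
    log_part = k1 ** theta / x
    sum_part = sum(k1 ** (theta - j) * k2 * (k1 + k2 * x) ** (j - 1)
                   for j in range(1, theta + 1))
    return log_part + sum_part


def lemma11_holds(max_theta):
    """Compare both sides at a handful of rational sample points."""
    samples = [(Fraction(1, 2), Fraction(1, 2), Fraction(1, 3)),
               (Fraction(2, 3), Fraction(-1, 4), Fraction(5, 7)),
               (Fraction(3), Fraction(2), Fraction(1, 9))]
    return all(integrand(k1, k2, t, x) == rhs_derivative(k1, k2, t, x)
               for t in range(max_theta + 1)
               for (k1, k2, x) in samples)
```

The equality is exact because the difference $(\kappa_1+\kappa_2 x)^{\theta} - \kappa_1^{\theta}$ factors as $\kappa_2 x$ times the sum appearing in `rhs_derivative`.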



A.1 Integral evaluation of (132) and (133)



Integrating over the random variables $A$ and $G$, the upper and lower bounds on $\mathbb{P}(V^* > v \mid m)$ are

$$\mathbb{P}_u(V^* > v \mid m) \triangleq \int_0^1\!\!\int_0^1 [\mathbb{P}_u(V_i > v \mid a,g)]^m\, \mathbb{P}_u(V' > v \mid a,g)\, \mathbb{P}(G > v \mid a,g)\, f_A(a) f_G(g)\, da\, dg, \tag{204}$$

$$\mathbb{P}_l(V^* > v \mid m) \triangleq \int_0^1\!\!\int_0^1 [\mathbb{P}_l(V_i > v \mid a,g)]^m\, \mathbb{P}_l(V' > v \mid a,g)\, \mathbb{P}(G > v \mid a,g)\, f_A(a) f_G(g)\, da\, dg. \tag{205}$$

The integrals may be evaluated by considering regions where the arguments have simple analytical descriptions
as functions of $a$ and $g$. We begin by noting that $\mathbb{P}(G > v \mid a,g,m) = I(v < g)$, so we may restrict our
analysis to regions where $v < g$. For the integral evaluation over $(a,g) \in R_j$, we use the notation

$$p_j(v,m) = \iint_{R_j} [\mathbb{P}_u(V_i > v \mid a,g)]^m\, \mathbb{P}_u(V' > v \mid a,g)\, \mathbb{P}(G > v \mid a,g)\, f_A(a) f_G(g)\, da\, dg \tag{206}$$

for the upper bound and

$$p'_j(v,m) = \iint_{R'_j} [\mathbb{P}_l(V_i > v \mid a,g)]^m\, \mathbb{P}_l(V' > v \mid a,g)\, \mathbb{P}(G > v \mid a,g)\, f_A(a) f_G(g)\, da\, dg \tag{207}$$

for the lower bound. Starting with the upper bound, the relevant regions are shown in Figure 8. The values
of $[\mathbb{P}_u(V_i > v \mid a,g)]^m$ and $\mathbb{P}_u(V' > v \mid a,g)$ are shown in Table 2. Note that in many cases, the $1/(1-g)$ factor
appearing in Table 2 cancels with the $(1 - g)$ factor from $f_G(g)$, which simplifies the expression.
Table 2: Arguments of (132) for regions shown in Figure 8.

Region      $[\mathbb{P}_u(V_i > v \mid a,g)]^m$      $\mathbb{P}_u(V' > v \mid a,g)$
$R_1$       $(1-v-a)^m$                               $(1-g-a)/(1-g)$
$R_2$       $(g-v)^m$                                 $0$
$R_3$       $(1-2v)^m$                                $(1-g-v)/(1-g)$
$R_4$       $(a+g-2v)^m$                              $(a-v)/(1-g)$
$R_5$       $(a+g-2v)^m$                              $(a-v)/(1-g)$
$R_6$       $(1-v)^m$                                 $1$
$R_7$       $(1-v-a)^m$                               $(1-g-a)/(1-g)$
$R_8$       $(g-v)^m$                                 $0$
$R_9$       $(a+g-2v)^m$                              $(a-v)/(1-g)$
$R_{10}$    $(1-v)^m$                                 $1$



Figure 8: Integration regions for (a) $v < \frac{1}{2}$ and (b) $v \ge \frac{1}{2}$.



Regions 1-6 correspond to the case where $v < \frac{1}{2}$.

$$p_1(v,m) = \int_0^v\!\!\int_v^{1-a} 2(1-v-a)^m(1-g-a)\,dg\,da = \int_0^v (1-a-v)^{2+m}\,da = \frac{-(1-2v)^{3+m}+(1-v)^{3+m}}{3+m}. \tag{208}$$

$$p_2(v,m) = 0. \tag{209}$$

$$p_3(v,m) = \int_v^{1-v}\!\!\int_v^{1-a} 2(1-2v)^m(1-g-v)\,dg\,da = \int_v^{1-v} (3v-a-1)(1-2v)^m(v+a-1)\,da = \frac{2}{3}(1-2v)^{3+m}. \tag{210}$$

$$\begin{aligned}
p_4(v,m) &= \int_v^{1-v}\!\!\int_{1-g}^{v+1-g} 2(a+g-2v)^m(a-v)\,da\,dg \\
&= \frac{1}{(1+m)(2+m)}\int_v^{1-v}\Big\{-2(1-2v)^{1+m}+4g(1-2v)^{1+m}-2m(1-2v)^{1+m}+2gm(1-2v)^{1+m} \\
&\qquad +2(1-v)^{1+m}-4g(1-v)^{1+m}+2m(1-v)^{1+m}-2gm(1-v)^{1+m}+2m(1-2v)^{1+m}v \\
&\qquad +2(1-v)^{1+m}v\Big\}\,dg \\
&= \frac{1}{(1+m)(2+m)}\Big\{-m(1-2v)^{3+m}+m(1-v)^m+2(1-v)^m v-3m(1-v)^m v \\
&\qquad -6(1-v)^m v^2+2m(1-v)^m v^2+4(1-v)^m v^3\Big\}.
\end{aligned} \tag{212}$$

$$\begin{aligned}
p_5(v,m) &= \int_{1-v}^{1}\!\!\int_v^{1+v-g} 2(a+g-2v)^m(a-v)\,da\,dg \\
&= \frac{1}{(1+m)(2+m)}\int_{1-v}^{1}\Big\{2(1-v)^{1+m}-4g(1-v)^{1+m}+2m(1-v)^{1+m}-2gm(1-v)^{1+m} \\
&\qquad +2(g-v)^{2+m}+2(1-v)^{1+m}v\Big\}\,dg \\
&= \frac{1}{(1+m)(2+m)(3+m)}\Big\{-2(1-2v)^{3+m}+2(1-v)^{1+m}-10(1-v)^{1+m}v-2m(1-v)^{1+m}v \\
&\qquad +14(1-v)^{1+m}v^2+7m(1-v)^{1+m}v^2+m^2(1-v)^{1+m}v^2\Big\}.
\end{aligned} \tag{214}$$

$$p_6(v,m) = \int_v^1\!\!\int_{1+v-g}^{1} (1-v)^m(2-2g)\,da\,dg = \int_v^1 (2-2g)(1-v)^m(g-v)\,dg = \frac{1}{3}(1-v)^{3+m}. \tag{215}$$

Summing all terms of $\mathbb{P}_u(V^* > v \mid m)$ for $v < \frac{1}{2}$ gives

$$\begin{aligned}
\mathbb{P}_u(V^* > v \mid m)_{<\frac{1}{2}} &= p_1(v,m)+p_2(v,m)+p_3(v,m)+p_4(v,m)+p_5(v,m)+p_6(v,m) \\
&= \frac{1}{3(3+m)}\Big\{2m(1-2v)^m+m(1-v)^m+9(1-v)^{3+m}-12m(1-2v)^m v \\
&\qquad -3m(1-v)^m v+24m(1-2v)^m v^2+3m(1-v)^m v^2-16m(1-2v)^m v^3 \\
&\qquad -m(1-v)^m v^3\Big\},
\end{aligned} \tag{216}$$

which simplifies to $\frac{2m(1-2v)^{3+m}+(9+m)(1-v)^{3+m}}{3(3+m)}$.
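Regions 7-10 are for the case $v \ge \frac{1}{2}$.

$$\begin{aligned}
p_7(v,m) &= \int_v^1\!\!\int_0^{1-g} 2(1-v-a)^m(1-g-a)\,da\,dg \\
&= \frac{1}{(1+m)(2+m)}\int_v^1\Big\{2(1-v)^{1+m}-4g(1-v)^{1+m}+2m(1-v)^{1+m}-2gm(1-v)^{1+m} \\
&\qquad +2(g-v)^{2+m}+2(1-v)^{1+m}v\Big\}\,dg = \frac{(1-v)^{3+m}}{3+m}.
\end{aligned} \tag{217}$$

$$p_8(v,m) = 0. \tag{218}$$

$$\begin{aligned}
p_9(v,m) &= \int_v^1\!\!\int_v^{1+v-g} 2(a+g-2v)^m(a-v)\,da\,dg \\
&= \frac{1}{(1+m)(2+m)}\int_v^1\Big\{2(1-v)^{1+m}-4g(1-v)^{1+m}+2m(1-v)^{1+m}-2gm(1-v)^{1+m} \\
&\qquad +2(g-v)^{2+m}+2(1-v)^{1+m}v\Big\}\,dg = \frac{(1-v)^{3+m}}{3+m}.
\end{aligned} \tag{219}$$

$$p_{10}(v,m) = \int_v^1\!\!\int_{1+v-g}^{1} (1-v)^m(2-2g)\,da\,dg = \int_v^1 (2-2g)(1-v)^m(g-v)\,dg = \frac{1}{3}(1-v)^{3+m}. \tag{220}$$

Summing these terms yields, for $v \ge \frac{1}{2}$,

$$\mathbb{P}_u(V^* > v \mid m)_{\ge\frac{1}{2}} = p_7(v,m)+p_8(v,m)+p_9(v,m)+p_{10}(v,m) = \frac{1}{3}(1-v)^{3+m}+\frac{2(1-v)^{3+m}}{3+m}. \tag{221}$$

At this point we have shown

$$\mathbb{P}(V^* > v \mid m) \le \mathbb{P}_u(V^* > v \mid m) = \begin{cases} \mathbb{P}_u(V^* > v \mid m)_{<\frac{1}{2}} & v < \frac{1}{2}, \\ \mathbb{P}_u(V^* > v \mid m)_{\ge\frac{1}{2}} & v \ge \frac{1}{2}. \end{cases} \tag{222}$$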

The lower bound requires integrating only over three regions, which are listed in Table 3.

Table 3: Arguments of (133) for integration regions.

Region      $[\mathbb{P}_l(V_i > v \mid a,g)]^m$      $\mathbb{P}_l(V' > v \mid a,g)$
$R'_1$      $(1-v)^m$                                 $1$
$R'_2$      $(g+a-2v)^m$                              $(a-v)/(1-g)$
$R'_3$      $(g-v)^m$                                 $0$

$$p'_1(v,m) = \int_v^1\!\!\int_{1+v-g}^{1} (1-v)^m(2-2g)\,da\,dg = \int_v^1 (1-v)^m(g-v)(2-2g)\,dg = \frac{1}{3}(1-v)^{3+m}. \tag{223}$$

$$\begin{aligned}
p'_2(v,m) &= \int_v^1\!\!\int_v^{1+v-g} 2(g+a-2v)^m(a-v)\,da\,dg \\
&= \frac{1}{(1+m)(2+m)}\int_v^1\Big\{2(1-v)^{1+m}-4g(1-v)^{1+m}+2m(1-v)^{1+m}-2gm(1-v)^{1+m} \\
&\qquad +2(g-v)^{2+m}+2(1-v)^{1+m}v\Big\}\,dg = \frac{(1-v)^{3+m}}{3+m}.
\end{aligned} \tag{225}$$

$$p'_3(v,m) = 0. \tag{226}$$

We then have

$$\mathbb{P}_l(V^* > v \mid m) = p'_1(v,m)+p'_2(v,m)+p'_3(v,m) = \frac{1}{3}(1-v)^{3+m}+\frac{(1-v)^{3+m}}{3+m}. \tag{227}$$
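The lower-bound expression (227) can be compared against a direct numerical evaluation of the two nonzero region integrals. The sketch below uses a plain midpoint rule; the region limits are the ones read off from (223) and (225), and the helper names and grid size are our own assumptions.

```python
def double_midpoint(f, g_lo, g_hi, a_bounds, n=300):
    """Midpoint-rule estimate of the integral of f(a, g) over g in [g_lo, g_hi]
    and a in [a_lo(g), a_hi(g)], where a_bounds(g) returns (a_lo, a_hi)."""
    total = 0.0
    hg = (g_hi - g_lo) / n
    for i in range(n):
        g = g_lo + (i + 0.5) * hg
        a_lo, a_hi = a_bounds(g)
        ha = (a_hi - a_lo) / n
        for k in range(n):
            a = a_lo + (k + 0.5) * ha
            total += f(a, g) * ha * hg
    return total


def lower_bound_numeric(v, m):
    """Numerical p'_1 + p'_2 for the regions in (223) and (225)."""
    p1 = double_midpoint(lambda a, g: (1 - v) ** m * (2 - 2 * g),
                         v, 1.0, lambda g: (1 + v - g, 1.0))
    p2 = double_midpoint(lambda a, g: 2 * (g + a - 2 * v) ** m * (a - v),
                         v, 1.0, lambda g: (v, 1 + v - g))
    return p1 + p2


def lower_bound_closed(v, m):
    """Closed form (227)."""
    return (1 - v) ** (3 + m) / 3 + (1 - v) ** (3 + m) / (3 + m)
```

The same `double_midpoint` helper could be reused for the upper-bound regions by changing the integrand and limits.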



A.2 Integral evaluation of (166)
We wish to evaluate 



/•i /"l-s 

,(Z*<z|ra,5<z) =11 ¥ U (Z* < z\g,a,m,ga)f A (a)fG(g)dadg, (228) 



Jo Jo 

where 

TO / 7 * i \ /, , Am _, {1- g + gz) m+1 (l - g + m- gm- gz) 

V U (Z < z\g,a,m,ga) = z 1 - g + gz) + — — — — — — 

— (1 — g)a[m + l)(m + 2) 

[1 — g + gz + a(l — z)] m+1 [l — g + m — gm — gz + a(— 1 — m + 2 + mz)] 



(1 - ,g)a(m + l)(m + 2) 

We first determine 



(229) 



\(Z* <z\g,a,m,ga)f A (a)da = J V U (Z* < z\g,a,m, ga)da. (230) 

The following constants simplify the expression: 

Ai = 1 - g + gz 
A 2 = z-1 

A3 = 1 — g + m — gm — gz 
A4 = —l — m + z + mz 

As = (l-.g)(m + l)(m + 2)- (231) 

This gives 

r ( A A A m+1 (\ 4-nA V m + 1 1 
\(Z* <z\g,a,m,ga)da = / zA?' - 5 3 1 + A 5 A 3 1 1+ 2) + AsA^ + aA 2 ) m+1 da 



az\T - A 5 A 3 Ar +1 log(a) + A 5 A 3 I Af +1 log(a) + £ ^ ^ + ^ 



A 5 A 4 . xm+2 

+ A 2 (m + 2) (Al + A2a) 

= azAl" + A 5 A 3 £ A? " +1 " (Al ± A2Q)J + yP^-AXr + A 2 a)"+ 2 , (232) 

^ j A 2 (m + 2) 
where we have made use of the integral identity from Lemma 11. Evaluating over the domain of integration
gives

pi-g ( m + 1 yn+i-j z j 
/ Pu(Z* < z\g,a,m,ga)da = (1 - g)z\? + A 5 A 3 V) X[ n+1 H(m + l) 

A 5 A 4 (z m + 2 - A™+ 2 ) 
+ A 2 (m + 2) ' 

(233) 

Next, we calculate 

-i /-l-s /-i / /-i-s \ 

< z\g,a,m,ga)fA{a)fG(g)dadg = [ ¥ U (Z* < z\g,a,m, ga)da\ (2 - 2g)dg. 

(234) 



o Jo 



We integrate each additive term separately 
Pi(m,z) 



f\l-g)z\ m (2-2g)dg 
Jo 

[ 2(l-gf Z {l-g + gz) m dg 
Jo 



2z (2 + m 2 (-l + zf + m(-l + z)(-3 + 5z) - 2z (3 - 3z + z 2+m )) 
(1 + m)(2 + m)(3 + to)(-1 + z) 3 



f 1 A7 l+1_J '^ 

= / A 5 A 3 ^- (2-2 5 )d 5 

Jo J 



dfj 



1 2(1 -g + m- gm - gz)(l -g + gz) m+1 -^J 
(m + l)(m + 2)j 
2z 3+m (j + (2 + m)(-2 + z) - jz) 
j(-3 + j - m)(-2 + j - m)(l + m)(2 + m)(-l + z) 2 
2z J (-j(l + m)(-l + z) + (2 + m)(-l + m(-l + z) + 2z)) 



p 3 (m,z) = 



j(-3 + j - m)(-2 + j - m)(l + m)(2 + m)(-l + z) 

/ -A 5 A 3 Ai ra+1 i7(m + l)(2-2.g)d 5 
■/ o 

1 2H(m + 1)(1 - g + m - gm - gz)(l -g + gz) m+1 



(m+l)(m + 2) 
2H(m + 1) (-1 + m(-l + z) + 2z + (-2 + z)z 3+m ) 



dg 



(1 + to) (2 + to) (3 + m)(-l + z) 2 

Z 1 A 5 A 4 (z m + 2 - A'" +2 ) ,„ „ , , 
= I A 2 (to + 2) J < 2 - 2 ^ 

2(-l - to + z + TOz)(z m + 2 - (1 - ,g + 5z) m+2 ) 

(to + 1)(to + 2) 2 (-1 + z) 5 
2 2z 2+m 2z 34 



(2 + m) 2 (3 + m)(-l + z) (2 + m) 2 (2 + to) 2 (3 + m)(-l + z) ' 



(235) 



(236) 



(237) 



(238) 
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With these terms, we have 

¥ U (Z* < z\m,ga) = pi(m,z) + P2j(m,z) + p 3 (m,z) + p 4 (m, . 



m+l 



3=1 



A.3 Integral evaluation of (173)
The integral is

$$\mathbb{P}_u(Z^* \le z \mid m,\overline{ga}) = \int_0^1 g\,\mathbb{P}_u(Z^* \le z \mid g,a,m,\overline{ga})\, f_G(g)\,dg,$$

where

$$\mathbb{P}_u(Z^* \le z \mid g,a,m,\overline{ga}) = z(1-g+gz)^m - \frac{(1-2g+(1-g)m)\,z^{2+m}}{(1-g)^2(1+m)(2+m)} + \frac{((1-g)(1+m)-gz)(1-g+gz)^{1+m}}{(1-g)^2(1+m)(2+m)}.$$



For the first term in $\mathbb{P}_u(Z^* \le z \mid m,\overline{ga})$,

$$\int_0^1 gz(1-g+gz)^m(2-2g)\,dg = \frac{2z\left(1+m-3z-mz+z^{2+m}(3+m-(1+m)z)\right)}{(1+m)(2+m)(3+m)(1-z)^3}.$$
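The closed form for this first term can be verified exactly by expanding $(1-g+gz)^m$ binomially and integrating term by term. In the sketch below the closed form is written with $(1-z)^3$ in the denominator, which is the form that is positive on $0 < z < 1$ and agrees with the term-by-term integral; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def first_term_exact(m, z):
    """Integral of g z (1 - g + g z)^m (2 - 2g) over g in [0, 1].

    With c = 1 - z the base is 1 - c g, and the integral of
    g^(k+1) (2 - 2g) over [0, 1] equals 2 / ((k + 2)(k + 3)).
    """
    c = 1 - z
    return z * sum(comb(m, k) * (-c) ** k * Fraction(2, (k + 2) * (k + 3))
                   for k in range(m + 1))


def first_term_closed(m, z):
    """Closed form with (1 - z)^3 in the denominator; z must differ from 1."""
    num = 2 * z * (1 + m - 3 * z - m * z + z ** (2 + m) * (3 + m - (1 + m) * z))
    return num / ((1 + m) * (2 + m) * (3 + m) * (1 - z) ** 3)
```

For instance, with $m = 0$ both functions reduce to $z/3$.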

To find the indefinite integral of the second term in $\mathbb{P}_u(Z^* \le z \mid m,\overline{ga})$, we use the substitution $g = 1 - e$:

$$\begin{aligned}
\int \frac{-2g(1-2g+(1-g)m)\,z^{2+m}}{(1-g)(1+m)(2+m)}\,dg &= \int \frac{2(1-e)(1-2(1-e)+em)\,z^{m+2}}{e(m+1)(m+2)}\,de \\
&= \int \frac{(-2+2e(m+3)-2e^2(m+2))\,z^{m+2}}{e(m+1)(m+2)}\,de \\
&= \int \frac{-2z^{m+2}}{e(m+1)(m+2)}\,de+\int \frac{2(m+3)z^{m+2}}{(m+1)(m+2)}\,de+\int \frac{-2ez^{m+2}}{m+1}\,de \\
&= \frac{-2z^{m+2}}{(m+1)(m+2)}\log(e)+\frac{2e(3+m)z^{m+2}}{(m+1)(m+2)}-\frac{e^2z^{m+2}}{m+1}.
\end{aligned}$$
For the indefinite integral of the final term in $\mathbb{P}_u(Z^* \le z \mid m,\overline{ga})$, we again use the substitution $g = 1 - e$:

$$\begin{aligned}
\int &\frac{2g((1-g)(1+m)-gz)(1-g+gz)^{1+m}}{(1-g)(1+m)(2+m)}\,dg \\
&= \int \frac{-2(1-e)(e(m+1)-(1-e)z)(1+(1-e)(z-1))^{m+1}}{e(m+1)(m+2)}\,de \\
&= \int \frac{(2z-2e(1+m+2z)-2e^2(-1-m-z))(z+e(1-z))^{m+1}}{e(m+1)(m+2)}\,de \\
&= \int \frac{2z(z+e(1-z))^{m+1}}{e(m+1)(m+2)}\,de+\int \frac{-2(1+m+2z)(z+e(1-z))^{m+1}}{(m+1)(m+2)}\,de \\
&\quad +\int \frac{-2e(-1-m-z)(z+e(1-z))^{m+1}}{(m+1)(m+2)}\,de \\
&= \frac{2z}{(m+1)(m+2)}\left(z^{m+1}\log(e)+\sum_{j=1}^{m+1}\frac{z^{m+1-j}(z+e(1-z))^j}{j}\right) \\
&\quad -\frac{2(1+m+2z)(z+e(1-z))^{m+2}}{(m+1)(m+2)^2(1-z)}-\frac{2e(-1-m-z)(z+e(1-z))^{m+2}}{(m+1)(m+2)^2(1-z)} \\
&\quad +\frac{2(-1-m-z)(z+e(1-z))^{m+3}}{(m+1)(m+2)^2(m+3)(1-z)^2}.
\end{aligned} \tag{244}$$
Note that we have used the integral identity from Lemma 11. For the second and third terms, the $\log(e)$ contributions cancel, and we have

$$\begin{aligned}
\int_0^1 &\left(\frac{((1-g)(1+m)-gz)(1-g+gz)^{1+m}}{(1-g)^2(1+m)(2+m)}-\frac{(1-2g+(1-g)m)\,z^{2+m}}{(1-g)^2(1+m)(2+m)}\right) g\,(2-2g)\,dg \\
&= \Bigg[\frac{2z}{(m+1)(m+2)}\sum_{j=1}^{m+1}\frac{z^{m+1-j}(z+e(1-z))^j}{j}-\frac{2(1+m+2z)(z+e(1-z))^{m+2}}{(m+1)(m+2)^2(1-z)} \\
&\qquad -\frac{2e(-1-m-z)(z+e(1-z))^{m+2}}{(m+1)(m+2)^2(1-z)}+\frac{2(-1-m-z)(z+e(1-z))^{m+3}}{(m+1)(m+2)^2(m+3)(1-z)^2} \\
&\qquad +\frac{2e(3+m)z^{m+2}}{(m+1)(m+2)}-\frac{e^2z^{m+2}}{m+1}\Bigg]_{e=1}^{e=0} \\
&= \frac{-2z}{(m+1)(m+2)}\sum_{j=1}^{m+1}\frac{z^{m+1-j}}{j}+\frac{2z}{(m+1)(m+2)^2(1-z)}+\frac{2(1+m+z)}{(m+1)(m+2)^2(m+3)(1-z)^2} \\
&\quad -\frac{(6+2m)z^{m+2}}{(m+1)(m+2)}+\frac{z^{m+2}}{m+1}+\frac{2H(m+1)z^{m+2}}{(m+1)(m+2)}-\frac{2(1+m+2z)z^{m+2}}{(m+1)(m+2)^2(1-z)} \\
&\quad -\frac{2(1+m+z)z^{m+3}}{(m+1)(m+2)^2(m+3)(1-z)^2}.
\end{aligned} \tag{245}$$
Altogether,

$$\begin{aligned}
\mathbb{P}_u(Z^* \le z \mid m,\overline{ga}) ={}& \frac{2z\left(1+m-3z-mz+z^{2+m}(3+m-(1+m)z)\right)}{(1+m)(2+m)(3+m)(1-z)^3} - \frac{2z}{(m+1)(m+2)}\sum_{j=1}^{m+1}\frac{z^{m+1-j}}{j} \\
&+ \frac{2z}{(m+1)(m+2)^2(1-z)} + \frac{2(1+m+z)}{(m+1)(m+2)^2(m+3)(1-z)^2} - \frac{(6+2m)z^{m+2}}{(m+1)(m+2)} \\
&+ \frac{z^{m+2}}{m+1} + \frac{2H(m+1)z^{m+2}}{(m+1)(m+2)} - \frac{2(1+m+2z)z^{m+2}}{(m+1)(m+2)^2(1-z)} \\
&- \frac{2(1+m+z)z^{m+3}}{(m+1)(m+2)^2(m+3)(1-z)^2}.
\end{aligned} \tag{246}$$